Blockchain, Ethereum and Smart Contracts

Usually, I try to make all the content I create original, based on readings, investigations, or situations and questions I have come across while working. Today's article is a little different. It is not copied from somewhere else, but it does contain pieces taken from other articles. You can find all the references at the end of this article. I put it together during some hack days I enjoyed recently, where I decided to dig a little deeper into Blockchain, Ethereum, and Smart Contracts. These are my field notes, in case someone else finds them useful.

Blockchain

What is Blockchain?

A blockchain is essentially an immutable digital ledger of transactions that is duplicated and distributed (shared) across the entire network of computer systems on the blockchain. Each block in the chain contains several transactions, and every time a new transaction occurs on the blockchain, a record of that transaction is added to every participant’s ledger. Blockchain facilitates the process of recording transactions and tracking assets in a business network. An asset can be tangible (e.g., a house, car, cash, land) or intangible (e.g., intellectual property, patents, copyrights, branding). Virtually anything of value can be tracked and traded on a blockchain network, reducing risk and cutting costs for all involved.

A decentralised database managed by multiple participants is known as Distributed Ledger Technology (DLT).

Blockchain is a type of DLT in which transactions are recorded together with a cryptographic fingerprint called a hash, which makes any later change to a record detectable.
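
To make the idea of a hash concrete, here is a minimal Python sketch. It is an illustration only: the record format is made up for this example and is not how any particular blockchain serialises its transactions. It uses SHA-256 from the standard library to show that changing a single character in a record yields a completely different fingerprint.

```python
import hashlib


def fingerprint(record: str) -> str:
    """Return the SHA-256 digest of a record as a hex string."""
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


# Hypothetical records: the text format is an assumption for illustration.
original = "Alice pays Bob 10"
tampered = "Alice pays Bob 11"

print(fingerprint(original))
print(fingerprint(tampered))

# The same input always gives the same digest; any edit changes it.
print(fingerprint(original) == fingerprint(original))  # True
print(fingerprint(original) == fingerprint(tampered))  # False
```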

The properties of Distributed Ledger Technology are:

  • Programmable: A blockchain is programmable, for example through smart contracts.
  • Secure: All records are individually secured with cryptography (hashed and digitally signed).
  • Anonymous: The identity of participants is either anonymous or pseudonymous.
  • Unanimous: All members of the network agree to the validity of each of the records.
  • Time-stamped: A transaction timestamp is recorded on a block.
  • Immutable: Any validated records are irreversible and cannot be changed.
  • Distributed: All network participants have a copy of the ledger for complete transparency.

How is Blockchain used?

Blockchain technology is used for many different purposes, from providing financial services to administering voting systems.

Cryptocurrency

The most common use of blockchain today is as the backbone of cryptocurrencies, like Bitcoin or Ethereum. When people buy, exchange or spend cryptocurrency, the transactions are recorded on a blockchain. The more people use cryptocurrency, the more widespread blockchain could become.

Banking

Beyond cryptocurrency, blockchain is being used to process transactions in fiat currency, like dollars and euros. This could be faster than sending money through a bank or other financial institution as the transactions can be verified more quickly and processed outside of normal business hours.

Asset Transfers

Blockchain can also be used to record and transfer the ownership of different assets. This is currently very popular with digital assets like NFTs, a representation of ownership of digital art and videos.

However, blockchain could also be used to process the ownership of real-life assets, like the deeds to real estate and vehicles. The two parties to a sale would first use the blockchain to verify that one owns the property and the other has the money to buy; then they could complete and record the sale on the blockchain. Using this process, they could transfer the property deed without manually submitting paperwork to update the local county’s government records; it would be instantaneously updated in the blockchain.

Smart Contracts

Another blockchain innovation is self-executing contracts, commonly called “smart contracts”. These digital contracts are enacted automatically once conditions are met. For instance, a payment for a good might be released instantly once the buyer and seller have met all specified parameters for a deal. Another example is the automation of legal contracts: “A properly coded smart legal contract on a distributed ledger can minimize, or preferably eliminate, the need for outside third parties to verify performance.”

Supply Chain Monitoring

Supply chains involve massive amounts of information, especially as goods go from one part of the world to the other. With traditional data storage methods, it can be hard to trace the source of problems, like which vendor poor-quality goods came from. Storing this information on the blockchain would make it easier to go back and monitor the supply chain, such as with IBM’s Food Trust, which uses blockchain technology to track food from its harvest to its consumption.

Voting

Experts are looking into ways to apply blockchain to prevent fraud in voting. In theory, blockchain voting would allow people to submit votes that could not be tampered with, and it would remove the need to have people manually collect and verify paper ballots.

Healthcare

Health care providers can leverage blockchain to securely store their patients’ medical records. When a medical record is generated and signed, it can be written into the blockchain, which provides patients with the proof and confidence that the record cannot be changed. These personal health records could be encoded and stored on the blockchain with a private key, so that they are only accessible by certain individuals, thereby ensuring privacy.

Advantages of Blockchain

Higher Accuracy of Transactions

Because a blockchain transaction must be verified by multiple nodes, this can reduce error. If one node has a mistake in the database, the others would see it is different and catch the error.

In contrast, in a traditional database, if someone makes a mistake, it may be more likely to go through. In addition, every asset is individually identified and tracked on the blockchain ledger, so there is no chance of double spending it (like a person overdrawing their bank account, thereby spending money twice).

No Need for Intermediaries

Using blockchain, two parties in a transaction can confirm and complete something without working through a third party. This saves time as well as the cost of paying for an intermediary like a bank. This can bring greater efficiency to all digital commerce, increase financial empowerment to the unbanked or underbanked populations of the world and power a new generation of internet applications as a result.

Extra Security

Theoretically, a decentralized network, like blockchain, makes it nearly impossible for someone to make fraudulent transactions. To enter forged transactions, they would need to hack every node and change every ledger. While this is not necessarily impossible, many cryptocurrency blockchain systems use proof-of-stake or proof-of-work transaction verification methods that make it difficult, as well as not in participants’ best interests, to add fraudulent transactions.

More Efficient Transfers

Since blockchains operate 24/7, people can make more efficient financial and asset transfers, especially internationally. They do not need to wait days for a bank or a government agency to manually confirm everything.

Disadvantages of Blockchain

Limit on Transactions per Second

Given that blockchain depends on a larger network to approve transactions, there is a limit to how quickly it can move. For example, Bitcoin can only process 4.6 transactions per second versus 1,700 per second with Visa. In addition, increasing numbers of transactions can create network speed issues. Until this improves, scalability is a challenge.

High Energy Costs

Having all the nodes working to verify transactions takes significantly more electricity than a single database or spreadsheet. Not only does this make blockchain-based transactions more expensive, but it also creates a large carbon burden for the environment.

Risk of Asset Loss

Some digital assets are secured using a cryptographic key, like cryptocurrency in a blockchain wallet. You need to guard this key carefully, because there is no centralised entity that can be called to recover it.

Potential for Illegal Activity

Blockchain’s decentralization adds more privacy and confidentiality, which unfortunately makes it appealing to criminals. It is harder to track illicit transactions on blockchain than through bank transactions that are tied to a name.

Common misconception: Blockchain vs Bitcoin

The goal of blockchain is to allow digital information to be recorded and distributed, but not edited. Blockchain technology was first outlined in 1991 but it was not until almost two decades later, with the launch of Bitcoin in January 2009, that blockchain had its first real-world application.

The key thing to understand here is that Bitcoin merely uses blockchain as a means to transparently record a ledger of payments, but blockchain can, in theory, be used to immutably record any number of data points. As discussed above, this could be in the form of transactions, votes in an election, product inventories, state identifications, deeds to homes, and much more.

How does a transaction get into the blockchain?

For a new transaction to be added to the blockchain a few steps need to happen:

Authentication

The original blockchain was designed to operate without a central authority (i.e. with no bank or regulator controlling who transacts), but transactions still have to be authenticated.

This is done using cryptographic keys, a string of data (like a password) that identifies a user and gives access to their “account” or “wallet” of value on the system.

Each user has their own private key and a public key that everyone can see. Using them both creates a secure digital identity to authenticate the user via digital signatures and to “unlock” the transaction they want to perform.
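
The sketch below illustrates the sign-with-private-key, verify-with-public-key idea in Python. It deliberately swaps in textbook RSA with tiny primes, because that fits in a few lines of standard-library code; Bitcoin and Ethereum actually use elliptic-curve signatures (ECDSA over the secp256k1 curve), and real keys are far larger. All names and numbers here are assumptions for illustration and the scheme is not secure.

```python
import hashlib

# Toy key pair built from tiny textbook primes (NOT secure, illustration only).
P, Q = 61, 53
N = P * Q                          # public modulus
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest(message: str) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % N


def sign(message: str) -> int:
    """Only the holder of the private exponent D can compute this value."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Anyone can check a signature using only the public values E and N."""
    return pow(signature, E, N) == digest(message)


tx = "Alice pays Bob 10"  # hypothetical transaction text
sig = sign(tx)
print(verify(tx, sig))            # True
print(verify(tx, (sig + 1) % N))  # False: an altered signature is rejected
```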

Authorisation

Once the transaction is agreed between the users, it needs to be approved, or authorised, before it is added to a block in the chain.

For a public blockchain, the decision to add a transaction to the chain is made by consensus. This means that the majority of “nodes” (or computers in the network) must agree that the transaction is valid. The people who own the computers in the network are incentivised to verify transactions through rewards. One widely used way of reaching this consensus is known as “proof of work”.
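
As a rough sketch of the majority rule, the Python below models each node as holding its own copy of the balances and counts approvals. This is a simplification made for illustration: real networks do not count explicit votes per transaction, and the ledger model and function names are assumptions.

```python
from typing import Dict, List

Ledger = Dict[str, int]  # account name -> balance (hypothetical model)


def node_accepts(ledger: Ledger, sender: str, amount: int) -> bool:
    """A node approves a payment only if its own copy shows enough funds."""
    return amount > 0 and ledger.get(sender, 0) >= amount


def network_accepts(ledgers: List[Ledger], sender: str, amount: int) -> bool:
    """The payment is valid when more than half of the nodes approve it."""
    approvals = sum(node_accepts(ledger, sender, amount) for ledger in ledgers)
    return approvals > len(ledgers) / 2


honest = {"alice": 50}
corrupted = {"alice": 5000}  # one node holding a wrong or tampered copy
nodes = [honest, dict(honest), dict(honest), corrupted]

print(network_accepts(nodes, "alice", 40))   # True: all four nodes approve
print(network_accepts(nodes, "alice", 400))  # False: only one node approves
```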

Proof of Work

Proof of Work requires the people who own the computers in the network to solve a complex mathematical problem to be able to add a block to the chain. Solving the problem is known as mining, and “miners” are usually rewarded for their work in cryptocurrency.

But mining is not easy. The mathematical problem can only be solved by trial and error and the odds of solving the problem are about 1 in 5.9 trillion. It requires substantial computing power which uses considerable amounts of energy. This means the rewards for undertaking the mining must outweigh the cost of the computers and the electricity cost of running them, as one computer alone would take years to find a solution to the mathematical problem.

The Problem with Proof of Work

To create economies of scale, miners often pool their resources together through companies that aggregate a large group of miners. These miners then share the rewards and fees offered by the blockchain network.

As a blockchain grows, more computers join to try and solve the problem, the problem gets harder and the network gets larger, theoretically distributing the chain further and making it even more difficult to sabotage or hack. In practice though, mining power has become concentrated in the hands of a few mining pools. These large organisations have the vast computing and electrical power now needed to maintain and grow a blockchain network based around Proof of Work validation.

Proof of Stake

Later blockchain networks have adopted “Proof of Stake” validation consensus protocols, where participants must have a stake in the blockchain – usually by owning some of the cryptocurrency – to be in with a chance of selecting, verifying, and validating transactions. This saves substantial computing power resources because no mining is required.
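
One way to picture stake-based selection is a weighted lottery, sketched below in Python. The validator names, the stake amounts, and the plain weighted draw are assumptions for illustration; real Proof of Stake protocols add many rules on top of this (deposits, penalties, committees) that are not modelled here.

```python
import random
from typing import Dict


def pick_validator(stakes: Dict[str, int], rng: random.Random) -> str:
    """Choose one validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical stakes; dave holds nothing and so can never be selected.
stakes = {"alice": 60, "bob": 30, "carol": 10, "dave": 0}
rng = random.Random(42)  # fixed seed so the run is repeatable

counts = {name: 0 for name in stakes}
for _ in range(10_000):
    counts[pick_validator(stakes, rng)] += 1

print(counts)  # larger stakes are picked more often; dave is never picked
```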

In addition, blockchain technologies have evolved to include “Smart Contracts” which automatically execute transactions when certain conditions have been met.

Risk in public blockchains

51% Attacks

Where blockchains have consensus rules based on a simple majority, there is a risk that malign actors will act together to influence the outcomes of the system. In the case of a cryptocurrency, this would mean a group of miners controlling more than 50% of the mining computing power can influence which transactions are validated and added to (or omitted from) the chain. On a blockchain that uses the Proof of Work (PoW) consensus protocol system, a 51% attack can also take the form of a “rival” chain – including fraudulent transactions – being created by malicious parties.

Through their superior mining capacity, these fraudsters can build an alternative chain that ends up being longer than the “true” chain, and therefore – because part of the Bitcoin Nakamoto consensus protocol is “the longest chain wins” – all participants must follow the fraudulent chain going forward.
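
The “longest chain wins” rule can be sketched in a couple of lines of Python. This is a simplification: in Bitcoin the winning chain is the one with the most accumulated proof of work, which this example approximates by counting blocks. The block labels are made up.

```python
from typing import List, Sequence


def choose_chain(candidates: Sequence[List[str]]) -> List[str]:
    """Follow the candidate chain that contains the most blocks."""
    return max(candidates, key=len)


honest_chain = ["genesis", "b1", "b2", "b3"]
# A rival branch that forks after b1 and is extended faster by attackers.
rival_chain = ["genesis", "b1", "x2", "x3", "x4"]

print(choose_chain([honest_chain, rival_chain]))
# ['genesis', 'b1', 'x2', 'x3', 'x4']
```

Every node applying this rule switches to the rival branch, which is why controlling most of the mining power is enough to rewrite recent history.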

In a large blockchain like Bitcoin this is increasingly difficult, but where a blockchain has “split” and the pool of miners is smaller, as in the case of Bitcoin Gold, a 51% attack is possible.

A 51% double-spend attack was successfully executed on the Bitcoin Gold blockchain in 2018 and on Ethereum Classic in January 2019, with fraudsters misappropriating millions of dollars of value.

Proof of Work vs Proof of Stake

A 51% attack in January 2019 on Ethereum Classic (the chain that split from Ethereum in 2016) strengthened the case for moving the Ethereum blockchain from Proof-of-Work (PoW) mining to Proof-of-Stake (PoS) validation.

However, Proof of Stake is more vulnerable to schisms or splits known as “forks”, where large stakeholders make different decisions about the transactions that should comprise blocks and end up creating yet another new currency. Ethereum launched with Proof of Work, brought its Proof of Stake Beacon Chain online in December 2020, and completed the switch of its main network to Proof of Stake in September 2022.

Double Spending

There is a risk that a participant with, for example, one bitcoin can spend it twice and fraudulently receive goods to the value of two bitcoins before one of the providers of goods or services realises that the money has already been spent. But this is, in fact, an issue with any system of electronic money, and is one of the principal reasons behind clearing and settlement systems in traditional currency systems.
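
A shared ledger prevents this by remembering which coins have already been spent. The Python sketch below shows the core check; the coin identifiers and the class name are hypothetical, and real systems track unspent outputs or account balances rather than a simple set.

```python
from typing import Set


class SpendLedger:
    """Tracks which coins have already been spent (coin ids are made up)."""

    def __init__(self) -> None:
        self.spent: Set[str] = set()

    def spend(self, coin_id: str) -> bool:
        """Accept the payment only if this coin has not been spent before."""
        if coin_id in self.spent:
            return False
        self.spent.add(coin_id)
        return True


ledger = SpendLedger()
print(ledger.spend("coin-1"))  # True: the first payment is accepted
print(ledger.spend("coin-1"))  # False: the same coin cannot be spent again
```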

How Does Blockchain Work?

Blockchain consists of three important concepts: blocks, nodes, and miners.

Blocks

Every chain consists of multiple blocks and each block has three basic elements:

  • The data in the block.
  • A 32-bit whole number called a nonce. The nonce is randomly generated when a block is created, which then generates a block header hash.
  • The hash is a 256-bit number wedded to the nonce. It must start with a huge number of zeroes (i.e., be extremely small).

When the first block of a chain is created, a nonce generates the cryptographic hash. The data in the block is considered signed and forever tied to the nonce and hash unless the block is mined again.

Miners

Miners create new blocks on the chain through a process called mining.

In a blockchain every block has its own unique nonce and hash, but also references the hash of the previous block in the chain, so mining a block isn’t easy, especially on large chains.

Miners use special software to solve the incredibly complex math problem of finding a nonce that generates an accepted hash. Because the nonce is only 32 bits and the hash is 256, there are roughly four billion possible nonce-hash combinations that must be mined before the right one is found. When that happens miners are said to have found the “golden nonce” and their block is added to the chain.
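
The search for a “golden nonce” can be sketched in Python as a simple loop. This is a toy version under stated assumptions: the block is reduced to a text payload, the difficulty is expressed as a number of leading zero hex digits, and a single SHA-256 is used. Bitcoin itself hashes a binary block header twice and compares the result against a numeric target.

```python
import hashlib
from typing import Optional


def block_hash(data: str, previous_hash: str, nonce: int) -> str:
    """Hash the block contents together with a candidate nonce."""
    payload = f"{previous_hash}|{data}|{nonce}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine(data: str, previous_hash: str, difficulty: int = 4,
         max_nonce: int = 2**32) -> Optional[int]:
    """Try nonces one by one until the hash starts with enough zeroes."""
    target_prefix = "0" * difficulty
    for nonce in range(max_nonce):  # a 32-bit nonce gives about 4 billion tries
        if block_hash(data, previous_hash, nonce).startswith(target_prefix):
            return nonce
    return None  # nonce space exhausted without finding a valid hash


previous_hash = "0" * 64  # placeholder parent hash for this example
nonce = mine("Alice pays Bob 10", previous_hash, difficulty=4)
print(nonce, block_hash("Alice pays Bob 10", previous_hash, nonce))
```

Each extra zero digit required makes the search about sixteen times longer on average, which is how difficulty can be raised as more computing power joins.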

Making a change to any block earlier in the chain requires re-mining not just the block with the change, but all of the blocks that come after. This is why it’s extremely difficult to manipulate blockchain technology. Think of it as “safety in maths” since finding golden nonces requires an enormous amount of time and computing power.
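
To see why a change ripples forward, the Python sketch below links blocks by hash and then forges one of them. Mining is left out to keep the example short, and the block layout and function names are assumptions for illustration.

```python
import hashlib
from typing import List, NamedTuple

GENESIS_PARENT = "0" * 64  # placeholder parent hash for the first block


class Block(NamedTuple):
    data: str
    previous_hash: str
    hash: str


def hash_block(data: str, previous_hash: str) -> str:
    return hashlib.sha256(f"{previous_hash}|{data}".encode("utf-8")).hexdigest()


def build_chain(records: List[str]) -> List[Block]:
    """Create blocks where each one stores the hash of its predecessor."""
    chain: List[Block] = []
    previous_hash = GENESIS_PARENT
    for data in records:
        current = hash_block(data, previous_hash)
        chain.append(Block(data, previous_hash, current))
        previous_hash = current
    return chain


def first_invalid_block(chain: List[Block]) -> int:
    """Return the index of the first block that fails validation, or -1."""
    previous_hash = GENESIS_PARENT
    for index, block in enumerate(chain):
        if block.previous_hash != previous_hash:
            return index  # link to the previous block is broken
        if block.hash != hash_block(block.data, block.previous_hash):
            return index  # contents no longer match the stored hash
        previous_hash = block.hash
    return -1


chain = build_chain(["tx-a", "tx-b", "tx-c"])
print(first_invalid_block(chain))  # -1: the untouched chain is valid

# Rewrite block 1 and recompute only its own hash, as a forger might.
forged_hash = hash_block("tx-b-forged", chain[1].previous_hash)
chain[1] = Block("tx-b-forged", chain[1].previous_hash, forged_hash)
print(first_invalid_block(chain))  # 2: the next block no longer links up
```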

When a block is successfully mined, the change is accepted by all of the nodes on the network and the miner is rewarded financially.

Nodes

One of the most important concepts in blockchain technology is decentralization. No one computer or organization can own the chain. Instead, it is a distributed ledger via the nodes connected to the chain. Nodes can be any kind of electronic device that maintains copies of the blockchain and keeps the network functioning.

Every node has its own copy of the blockchain and the network must algorithmically approve any newly mined block for the chain to be updated, trusted, and verified. Since blockchains are transparent, every action in the ledger can be easily checked and viewed. Each participant is given a unique alphanumeric identification number that shows their transactions.

Combining public information with a system of checks and balances helps the blockchain maintain integrity and creates trust among users. Essentially, blockchains can be thought of as the scalability of trust via technology.

Ethereum

What is Ethereum?

Ethereum is often referred to as the second most popular cryptocurrency, after Bitcoin. But unlike Bitcoin and most other virtual currencies, Ethereum is intended to be much more than simply a medium of exchange or a store of value. Instead, Ethereum calls itself a decentralized computing network built on blockchain technology.

It is distributed in the sense that everyone participating in the Ethereum network holds an identical copy of this ledger, letting them see all past transactions. It is decentralised in that the network is not operated or managed by any centralised entity – instead, it is managed by all of the distributed ledger holders.

In Ethereum’s case, the participants who validate blocks (miners under Proof of Work, validators under Proof of Stake) are rewarded with cryptocurrency tokens called Ether (ETH).

Ether can be used to buy and sell goods and services, like Bitcoin. It’s also seen rapid gains in price over recent years, making it a de-facto speculative investment. But what’s unique about Ethereum is that users can build applications that “run” on the blockchain like software “runs” on a computer. These applications can store and transfer personal data or handle complex financial transactions. This is one of the big differences from Bitcoin: the Ethereum network can perform computations as part of the mining process. This basic computational capability turns a store of value and medium of exchange into a decentralized global computing engine and openly verifiable data store.

Although it can be inferred from the previous paragraphs, it is worth emphasising that the terms Ethereum and Ether mean different things: Ether is a digital currency used for financial transactions, while Ethereum is the blockchain network on which Ether is held and exchanged. The network also offers a variety of other functions beyond ETH.

As said before, the Ethereum network can also be used to store data and run decentralized applications. Rather than hosting software on a server owned and operated by Google or Amazon, where the one company controls the data, people can host applications on the Ethereum blockchain. This gives users control over their data and they have open use of the app as there’s no central authority managing everything.

Perhaps one of the most intriguing use cases involving Ether and Ethereum is self-executing contracts, or so-called smart contracts. Like any other contract, two parties make an agreement about the delivery of goods or services in the future. Unlike conventional contracts, lawyers are not necessary: The parties code the contract on the Ethereum blockchain, and once the conditions of the contract are met, it self-executes and delivers Ether to the appropriate party.

Ethereum Benefits

  • Large, existing network: Ethereum is a tried-and-true network that has been tested through years of operation and billions of dollars of value changing hands. It has a large and committed global community and the largest ecosystem in blockchain and cryptocurrency.
  • Wide range of functions: Besides being used as a digital currency, Ethereum can also be used to process other types of financial transactions, execute smart contracts and store data for third-party applications.
  • Constant innovation: A large community of Ethereum developers is constantly looking for new ways to improve the network and develop new applications. Because of Ethereum’s popularity, it tends to be the preferred blockchain network for new and exciting (and sometimes risky) decentralised applications.
  • Avoids intermediaries: Ethereum’s decentralised network promises to let users leave behind third-party intermediaries, like lawyers who write and interpret contracts, banks that are intermediaries in financial transactions or third-party web hosting services.

Ethereum Disadvantages

  • Rising transaction costs: Ethereum’s growing popularity has led to higher transaction costs. Ethereum transaction fees, also known as “gas,” hit a record $23 per transaction in February 2021, which is great if you’re earning money as a miner but less so if you’re trying to use the network. This is because unlike Bitcoin, where the network itself rewards transaction verifiers, Ethereum requires those participating in the transaction to cover the fee.
  • Potential for crypto inflation: While Ethereum limits issuance to 18 million Ether per year, there’s no lifetime limit on the potential number of coins. This could mean that as an investment, Ethereum might function more like dollars and may not appreciate as much as Bitcoin, which has a strict lifetime limit on the number of coins.
  • Steep learning curve for developers: Ethereum can be difficult for developers to pick up as they migrate from centralised processing to decentralised networks.
  • Unknown future: Ethereum continues to evolve and improve, and the release of Ethereum 2.0 (1st of December 2020) holds out the promise of new functions and greater efficiency. This major update to the network, however, is creating uncertainty for apps and deals currently in use. Is someone familiar with migrations?
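
To make the transaction-cost point above more tangible, here is a small Python sketch of the basic fee arithmetic. Strictly speaking, gas measures the amount of computation a transaction uses, and the fee is that amount multiplied by a gas price quoted in gwei (one billionth of an Ether). A plain Ether transfer uses 21,000 gas; the gas price and the ETH/USD rate below are assumed values picked only to keep the numbers simple.

```python
GWEI_PER_ETH = 10**9          # 1 ETH = 1,000,000,000 gwei
SIMPLE_TRANSFER_GAS = 21_000  # gas consumed by a plain ETH transfer


def fee_in_eth(gas_used: int, gas_price_gwei: float) -> float:
    """Fee = gas used x price per unit of gas, converted from gwei to ETH."""
    return gas_used * gas_price_gwei / GWEI_PER_ETH


def fee_in_usd(gas_used: int, gas_price_gwei: float,
               eth_price_usd: float) -> float:
    return fee_in_eth(gas_used, gas_price_gwei) * eth_price_usd


# Assumed market conditions, not real data.
gas_price_gwei = 100
eth_price_usd = 1_500

print(fee_in_eth(SIMPLE_TRANSFER_GAS, gas_price_gwei))  # 0.0021
print(round(fee_in_usd(SIMPLE_TRANSFER_GAS, gas_price_gwei,
                       eth_price_usd), 2))              # 3.15
```

Smart contract calls use more gas than a plain transfer, so the same gas price translates into a proportionally higher fee.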

Ethereum as a Platform for Applications

Single Source of Truth

The Ethereum blockchain establishes a single source of truth for the ecosystem by maintaining a public transaction database. This shared database captures all transactions that occur between users and applications. Unique virtual addresses identify actors, and each transaction captures participating addresses. These addresses do not reveal any personal information, allowing users to remain anonymous. Transactions are batched together into blocks and validated by thousands of computers or “nodes” before joining the public ledger.

Once posted, no one can remove or alter transactions. Because records are sealed, bad actors cannot revert a transaction after the fact or tamper with the public record. Balances are known and transactions are settled in real-time, so the entire ecosystem is on the same page. This append-only approach assures users and applications that the current state of the blockchain is final and trustworthy.

Platform for Applications

In addition to providing a shared source of truth, the Ethereum blockchain provides a platform for applications. The ability to store and power these applications sets the Ethereum blockchain apart from the Bitcoin blockchain.

The Ethereum blockchain provides critical application infrastructure similar to web services. Thousands of nodes that maintain the single source of truth also supply resources like storage, processing power, and bandwidth. Because people run these nodes all over the globe and the collective resources function as a single machine, Ethereum is referred to as the “world computer”.

Ethereum is different from centralized web services in that transaction data and applications are distributed across thousands of nodes, rather than a few data centres controlled by a corporation. This feature, known as decentralization, leads to a highly redundant and resilient ecosystem that cannot be controlled or censored by a single entity.

The code for these applications lives on the blockchain just like the transactions created by users and other applications. As a result, applications deployed on Ethereum are open and auditable. These apps are also designed to interoperate with other apps in the ecosystem (a stark departure from the traditional “black box” approach for software). Once applications are deployed to Ethereum, they operate autonomously, meaning that they will execute the programs they are designed to run without manual intervention. They are controlled by code, not by individuals or companies. For this reason, applications are referred to as “smart contracts”.

A common analogy for smart contracts is vending machines. Vending machines are programmed to automatically deliver specific items based on specific inputs. Users punch in a code and receive the corresponding item. Likewise, smart contracts receive inputs from users, execute their programmed code, and produce an output.
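
The vending machine analogy translates almost directly into code. The Python class below is only a model of the idea (it does not run on a blockchain), and the item codes and prices are made up: the rules are fixed in code, and the outcome follows automatically from the inputs.

```python
from typing import Dict, Tuple


class VendingMachine:
    """Delivers an item automatically when the coded conditions are met."""

    def __init__(self, prices: Dict[str, int], stock: Dict[str, int]) -> None:
        self.prices = prices  # item code -> price in cents
        self.stock = stock    # item code -> units left

    def purchase(self, code: str, payment: int) -> Tuple[str, int]:
        """Return (item code, change), or raise if a condition is not met."""
        if self.stock.get(code, 0) == 0:
            raise ValueError("item unavailable")
        price = self.prices[code]
        if payment < price:
            raise ValueError("insufficient payment")
        self.stock[code] -= 1
        return code, payment - price


machine = VendingMachine(prices={"A1": 150}, stock={"A1": 1})
print(machine.purchase("A1", 200))  # ('A1', 50)
```

A second purchase of "A1" would raise an error because the stock is exhausted; nobody has to step in to enforce that rule.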

Decentralized Finance (DeFi)

Because the most fundamental data recorded on the Ethereum blockchain is accounts, balances, and transactions, it makes sense to build financial applications on top of Ethereum. Users and other applications can freely interact with these financial applications because they are public and permissionless by default.

Simple financial services like lending and borrowing can be programmed and deployed on the blockchain as applications, allowing users (and other applications) to earn interest on digital assets or take out loans. Order book exchanges can autonomously pair buyers and sellers at no charge. Automated market makers can minimise spreads by creating liquidity pools that automatically rebalance according to predefined logic. Derivatives can be deployed on the blockchain so that contract terms are known and the underlying assets are priced in real-time.
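
As an example of the “predefined logic” behind an automated market maker, the Python sketch below uses the constant-product rule, where the product of the two reserves stays the same across a trade. This is one common design, chosen here as an assumption; the text above does not say which rule a given application uses, and trading fees are ignored.

```python
class LiquidityPool:
    """Two-asset pool that keeps reserve_a * reserve_b constant (no fees)."""

    def __init__(self, reserve_a: float, reserve_b: float) -> None:
        self.reserve_a = reserve_a
        self.reserve_b = reserve_b

    def price_of_a(self) -> float:
        """Current price of one unit of A, expressed in units of B."""
        return self.reserve_b / self.reserve_a

    def swap_a_for_b(self, amount_a: float) -> float:
        """Deposit A and withdraw the amount of B that keeps the product fixed."""
        if amount_a <= 0:
            raise ValueError("amount must be positive")
        k = self.reserve_a * self.reserve_b
        new_reserve_a = self.reserve_a + amount_a
        new_reserve_b = k / new_reserve_a
        amount_b = self.reserve_b - new_reserve_b
        self.reserve_a, self.reserve_b = new_reserve_a, new_reserve_b
        return amount_b


# Hypothetical pool: 100 units of asset A against 10,000 units of asset B.
pool = LiquidityPool(reserve_a=100.0, reserve_b=10_000.0)
print(pool.price_of_a())                  # 100.0
print(round(pool.swap_a_for_b(10.0), 2))  # 909.09
print(round(pool.price_of_a(), 2))        # 82.64
```

Selling A into the pool lowers its price automatically, which is the rebalancing the paragraph above describes.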

Smart contracts

A smart contract is a self-executing contract with the terms of the agreement between buyer and seller being directly written into lines of code. The code and the agreements contained therein exist across a distributed, decentralised blockchain network. The code controls the execution, and transactions are trackable and irreversible. In other words, a smart contract is simply a piece of code that is running on Ethereum and can control valuable things like ETH or other digital assets.

Smart contracts permit trusted transactions and agreements to be carried out among disparate, anonymous parties without the need for a central authority, legal system, or external enforcement mechanism.

The fact that smart contracts are computer programs deployed on a blockchain network brings to the table some inherent characteristics:

  • Immutable: Once deployed, the code of a smart contract cannot change. Unlike traditional software, the only way to modify a smart contract is to deploy a new instance.
  • Deterministic: The outcome of the execution of a smart contract is the same for everyone who runs it, given the context of the transaction that initiated its execution and the state of the Ethereum blockchain at the moment of execution.
  • Ethereum Virtual Machine (EVM) context: Smart contracts operate with a very limited execution context. They can access their own state, the context of the transaction that called them, and some information about the most recent blocks.
  • Decentralized world computer: The EVM runs as a local instance on every Ethereum node, but because all instances of the EVM operate on the same initial state and produce the same final state, the system as a whole operates as a single “world computer” (as mentioned before).

Languages to write smart contracts

  • LLL: A functional (declarative) programming language, with Lisp-like syntax. It was the first high-level language for Ethereum smart contracts but is rarely used today.
  • Serpent: A procedural (imperative) programming language with a syntax similar to Python. Can also be used to write functional (declarative) code, though it is not entirely free of side effects.
  • Solidity: A procedural (imperative) programming language with a syntax similar to JavaScript, C++, or Java. The most popular and frequently used language for Ethereum smart contracts. (We will be using this one)
  • Vyper: A more recently developed language, similar to Serpent and again with Python-like syntax. Intended to get closer to a pure-functional Python-like language than Serpent, but not to replace Serpent.
  • Bamboo: A newly developed language, influenced by Erlang, with explicit state transitions and without iterative flows (loops). Intended to reduce side effects and increase auditability. Very new and yet to be widely adopted.

Non-fungible tokens (NFT)

NFTs are tokens that we can use to represent ownership of unique items. They let us tokenise things like art, collectables, even real estate. They can only have one official owner at a time and they’re secured by the Ethereum blockchain – no one can modify the record of ownership or copy/paste a new NFT into existence.

NFT stands for non-fungible token. Non-fungible is an economic term that you could use to describe things like your furniture, a song file, or your computer. These things are not interchangeable with other items because they have unique properties.

Fungible items, on the other hand, can be exchanged because their value defines them rather than their unique properties. For example, ETH or dollars are fungible because 1 ETH / 1 USD is exchangeable for another 1 ETH / 1 USD.
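
The difference shows up clearly in how the two kinds of token would be stored. In the Python sketch below (a plain in-memory model with made-up names, not an implementation of any token standard), a fungible token only needs a balance per owner, while a non-fungible token needs an owner per unique token id.

```python
from typing import Dict


class FungibleToken:
    """Only the amount matters: any unit is interchangeable with any other."""

    def __init__(self) -> None:
        self.balances: Dict[str, int] = {}

    def mint(self, owner: str, amount: int) -> None:
        self.balances[owner] = self.balances.get(owner, 0) + amount

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid amount or insufficient balance")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


class NonFungibleToken:
    """Each token id is unique and has exactly one owner at a time."""

    def __init__(self) -> None:
        self.owners: Dict[int, str] = {}

    def mint(self, token_id: int, owner: str) -> None:
        if token_id in self.owners:
            raise ValueError("token id already exists")
        self.owners[token_id] = owner

    def transfer(self, sender: str, recipient: str, token_id: int) -> None:
        if self.owners.get(token_id) != sender:
            raise ValueError("sender does not own this token")
        self.owners[token_id] = recipient


coins = FungibleToken()
coins.mint("alice", 5)
coins.transfer("alice", "bob", 2)
print(coins.balances)  # {'alice': 3, 'bob': 2}

art = NonFungibleToken()
art.mint(1, "alice")
art.transfer("alice", "bob", 1)
print(art.owners)  # {1: 'bob'}
```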

Comparison

An NFT internet | The internet today
NFTs are digitally unique, no two NFTs are the same. | A copy of a file, like a .mp3 or .jpg, is the same as the original.
Every NFT must have an owner and this is of public record and easy for anyone to verify. | Ownership records of digital items are stored on servers controlled by institutions – you must take their word for it.
NFTs are compatible with anything built using Ethereum. An NFT ticket for an event can be traded on every Ethereum marketplace, for an entirely different NFT. You could trade a piece of art for a ticket! | Companies with digital items must build their own infrastructure. For example, an app that issues digital tickets for events would have to build its own ticket exchange.
Content creators can sell their work anywhere and can access a global market. | Creators rely on the infrastructure and distribution of the platforms they use. These are often subject to terms of use and geographical restrictions.
Creators can retain ownership rights over their own work, and claim resale royalties directly. | Platforms, such as music streaming services, retain the majority of profits from sales.
Items can be used in surprising ways. For example, you can use digital artwork as collateral in a decentralised loan. |

Solidity

Solidity is a contract-oriented, high-level language for implementing smart contracts. It was influenced by C++, Python, and JavaScript and is designed to target the Ethereum Virtual Machine (EVM).

Solidity is statically typed, supports inheritance, libraries, and complex user-defined types among other features.

Let’s see an example. Let’s build a simple token contract:

pragma solidity ^0.4.0;

contract SimpleToken {
    int64 constant TOTAL_UNITS = 100000;
    int64 outstanding_tokens;
    address owner;
    mapping(address => int64) holdings;

    function SimpleToken() public { // Constructor
        outstanding_tokens = TOTAL_UNITS;
        owner = msg.sender; // msg.sender represents the address that initiated this contract call
    }

    // Declaring some events
    event TokenAllocation(address holder, int64 number, int64 remaining);
    event TokenMovement(address from, address to, int64 value);
    event InvalidTokenUsage(string reason);

    function getOwner() public constant returns (address) {
        return owner;
    }

    // Allocate tokens
    function allocate(address newHolder, int64 value) public {
        if (msg.sender != owner) {
            InvalidTokenUsage('Only owner can allocate tokens');
            return;
        }

        if (value < 0) {
            InvalidTokenUsage('Cannot allocate negative value');
            return; // stop here, otherwise the allocation below would still run
        }

        if (value <= outstanding_tokens) {
            holdings[newHolder] += value;
            outstanding_tokens -= value;
            TokenAllocation(newHolder, value, outstanding_tokens);
        } else {
            InvalidTokenUsage('Value to allocate larger than outstanding tokens');
        }
    }

    // Move tokens
    function move(address destination, int64 value) public {
        address source = msg.sender;

        if (value < 0) {
            InvalidTokenUsage('Must move value greater than zero');
            return; // stop here, otherwise the transfer below would still run
        }

        if (holdings[source] >= value) {
            holdings[destination] += value;
            holdings[source] -= value;
            TokenMovement(source, destination, value);
        } else {
            InvalidTokenUsage('Value to move larger than holdings');
        }
    }

    // Getters & fallback
    function myBalance() constant public returns (int64) {
        return holdings[msg.sender];
    }

    function holderBalance(address holder) constant public returns (int64) {
        if (msg.sender != owner) {
            return;
        }

        return holdings[holder];
    }

    function outstandingBalance() constant public returns (int64) {
        if (msg.sender != owner) {
            return;
        }

        return outstanding_tokens;
    }

    function() public {
        revert();
    }
}

Because the environment I am working in is a bit restricted, the contract does not follow the latest style described in the documentation of the most recent version but, as an example, it should work. You can find the latest version here.

Building stuff

Setting up the environment

Dependencies required:

  • nodejs
  • Truffle framework: Framework to create Ethereum smart contracts
  • Ganache: Quickly fire up a personal Ethereum blockchain which you can use to run tests, execute commands, and inspect state while controlling how the chain operates.
  • Metamask: Configure our fake ether addresses to interact with the apps we are building. It is a Chrome extension.

Installing dependencies

# Installing node
$ sudo apt install nodejs
$ node -v
 
 
# Install Truffle
$ sudo npm install -g truffle
 
 
# Installing Ganache
$ wget https://github.com/trufflesuite/ganache/releases/download/v2.5.4/ganache-2.5.4-linux-x86_64.AppImage
$ chmod a+x ganache-2.5.4-linux-x86_64.AppImage
$ ./ganache-2.5.4-linux-x86_64.AppImage
 
 
# Installing Solidity compiler
$ sudo add-apt-repository ppa:ethereum/ethereum
$ sudo apt update
$ sudo apt install solc

Project 1: Memory Game with Blockchain

Memory Game, also known as the Concentration card game or Matching Game, is a simple card game where you need to match pairs by turning over 2 cards at a time. Once a match has been made, we can keep the card forever by adding it to the blockchain.

Elements involved:

  • Smart contract
  • NFT

Source code can be found here.

Project 2: Decentralised Twitter

Just a very basic decentralized Twitter.

Elements involved:

  • Drizzle: A collection of front-end libraries that make writing Dapp user interfaces easier and more predictable.
  • Smart contract

Source code can be found here.

Notes

Ether, Gas, Gas Cost, Fees

  • Ether – the cryptocurrency underpinning Ethereum.
  • Gas – the unit used to measure the execution of your transaction.
  • Gas Cost – the price of one “gas unit” that you are prepared to pay.
  • Set a higher gas cost to get faster confirmation.
  • Fee – the (gas * gasCost) cost you pay to run your transaction.
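The fee formula above can be sketched in a few lines of Java. This is only an illustration: the class name GasFee, the helper feeInWei, and the figures used (21,000 gas at 50 gwei per gas unit) are assumptions made for the example, not values taken from any network.

```java
import java.math.BigInteger;

final class GasFee {
    // 1 gwei = 10^9 wei (and 1 ether = 10^18 wei).
    static final BigInteger WEI_PER_GWEI = BigInteger.TEN.pow(9);

    /** Fee in wei = gas units used * price per gas unit (given in gwei). */
    static BigInteger feeInWei(long gasUsed, long gasPriceGwei) {
        return BigInteger.valueOf(gasUsed)
                .multiply(BigInteger.valueOf(gasPriceGwei))
                .multiply(WEI_PER_GWEI);
    }

    public static void main(String[] args) {
        // Assumed figures for the example: 21,000 gas at 50 gwei per gas unit.
        BigInteger fee = feeInWei(21_000, 50);
        System.out.println("Fee in wei: " + fee);
    }
}
```

BigInteger is used because amounts expressed in wei quickly exceed what fits comfortably in a long once several values are added up.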

Tools

  • Ethereum nodes
    • Geth: An Ethereum client, which means that we can run our own private blockchain with it. It is a command-line tool.
    • parity: Parity Ethereum is a software stack that lets you run blockchains based on the Ethereum Virtual Machine (EVM) with several different consensus engines.
    • Ganache: It allows you to create your own private blockchain, mainly for testing purposes. It has a UI.
  • Cloud environments
    • Infura.io: Infura’s development suite provides instant, scalable API access to the Ethereum and IPFS networks.
    • Microsoft Azure
  • IDEs
    • Normal IDEs. IntelliJ has a plugin for Solidity
    • Javascript editors are good for building the tests and any app
  • Dev environment
    • Web3j: Web3j is a library for working with Smart Contracts and integrating with Ethereum blockchains. This allows you to work with Ethereum blockchains, without the additional overhead of having to write your own integration code for the platform.
    • Embark: The all-in-one developer platform for building and deploying decentralized applications
    • Truffle: Framework to create Ethereum smart contracts
    • Brownie: Brownie is a Python-based development and testing framework for smart contracts targeting the EVM.
  • Tools
    • Etherchain: makes the Ethereum blockchain accessible to non-technical end-users.
    • remix: Remix IDE is an open-source web and desktop application. It fosters a fast development cycle and has a rich set of plugins with intuitive GUIs. Remix is used for the entire journey of contract development as well as being a playground for learning and teaching Ethereum.
    • etherscan: Etherscan is a Block Explorer and Analytics Platform for Ethereum, a decentralized smart contracts platform
    • EthGasStation: ETH Gas Station aims to increase the transparency of gas prices, transaction confirmation times, and miner policies on the Ethereum network.
    • Metamask: MetaMask is a software cryptocurrency wallet used to interact with the Ethereum blockchain. It allows users to access their Ethereum wallets through a browser extension or mobile app.

Blockchain for Development

Type | Tool
Emulators | Ganache, Embark
Lightweight nodes | Ethereumjs-vm, Pyethereum
Local Regular Blockchains | Geth, Parity
Hosted Nodes or Chains | Infura, Azure
Public Testing Blockchains | Rinkeby, Ropsten
Public Blockchain | Mainnet

References

What Is Blockchain?

Blockchain explained

Blockchain technology defined

Mastering Ethereum (book)

Ethereum Whitepaper

What is Ethereum and How does it work?

Understanding Ethereum

Intro to Ethereum programming (video)


Maintaining compatibility

Most of the companies nowadays are implementing or want to implement architectures based on micro-services. While this can help companies to overcome multiple challenges, it can bring its own new challenges to the table.

In this article, we are going to discuss a very concrete one: maintaining compatibility when we change the objects our micro-services use to communicate with each other. Sometimes, they are called API objects, Data Transfer Objects (DTO) or similar. And, more concretely, we are going to be using Jackson and JSON as a serialisation (marshalling and unmarshalling) mechanism.

There are some other methods, other technologies and other ways to achieve this but, this is just one tool to keep on our belt and be aware of to make informed decisions when faced with this challenge in the future.

Maintaining compatibility is a very broad term. To establish what we are talking about and ensure we are on the same page, let us see a few examples of real situations we are trying to mitigate:

  • Deploying breaking changes when releasing new features or services, due to changes in the objects used to communicate between the different services. This matters especially if, at deployment time, there is a short period of time where the old and the new versions are both running (almost impossible to avoid unless you stop both services, deploy them and restart them again).
  • Being forced to follow a strict order of deployment for our different services. We should be able to deploy in any order and whenever it best suits the business and the different teams involved.
  • Being forced, due to a change in one object in one concrete service, to deploy multiple services not directly involved in or affected by the change.
  • Related to the previous point, being forced to change other services because of a small data or structural change. An example of this would be objects that travel through different systems, being enriched with extra information along the way, and are finally shown to the user by the last one.

To exemplify the kind of situation we can find ourselves in, let us take a look at the image below. In this scenario, we have four different services:

  • Service A: It stores some basic user information such as the first name and last name of a user.
  • Service B: It enriches the user information with extra information about the job position of the user.
  • Service C: It adds some extra administrative information such as the number of complaints open against the user.
  • Service D: It finally uses all the information about the user to, for example, calculate some advice based on performance and area of work.

All of this is deployed and working on our production environment using the first version of our User object.

At some point, product managers decided the age field should be considered on the calculations to be able to offer users extra advice based on proximity of retirement. This added requirement is going to create a second version of our User object where the field age is present.

Just a last comment, for simplicity purposes, let us say the communication between services is asynchronous based on queues.

As we can see in the image, in this situation only services A and D should be modified and deployed. This is what we are trying to achieve and what I mean by maintaining compatibility. But, first, let us explore what options we have at this point:

  1. Upgrade all services to the second version of the object User before we start sending messages.
  2. Avoid sending the User from service A to service D, send just an id, and perform a call from service D to recover the User information based on the id.
  3. Keep the unknown fields on an object, even if the service processing the message at this point does not know anything about them.
  4. Fail the message, and store it for re-processing until we perform the upgrade to all services involved. This option is not valid on synchronous communications.

Option 1

As we have described, it implies updating the dependency service-a-user in all the projects. This is possible but it quickly brings some problems to the table:

  • We not only need to update direct dependencies but indirect dependencies too, which can be hard to track and easy to miss. In addition, a decision needs to be made about what to do when a dependency is missed: should an error be thrown? Should we fail silently?
  • We have a problem with scenarios where we need to roll back a deployment due to something going wrong. Should we roll back everything? Good luck! Should we try to fix the problem while our system is not behaving properly?
  • Heavy refactoring operations or modifications can make upgrades very hard to perform.

Option 2

Instead of sending the object information on the message, we just send an id so that the object information can be recovered later using a REST call. This option, while very useful in multiple cases, is not exempt from problems:

  • What if, instead of just a couple of enrichers, we have a dozen of them and they need the user information? Should we consolidate all services and send ids for the enriched information, creating stores on the enrichers?
  • If, instead of a queue, other mechanisms of communications are used such as RPC, do now all the services need to call service A to recover the User information and do their job? This just creates a cascade of calls.
  • And, under this scenario, we can have inconsistent data if there is any update while the different services are recovering a User.

Option 3

This is the desired option and the one we are going to dive deep into in this article: using Jackson and JSON to keep the fields even if the processing service does not know anything about them.

It is worth saying in advance that, as always, there are no silver bullets. There are problems that not even this solution can solve, but it will mitigate most of the ones we have named in the previous lines.

One problem we are not able to solve with this approach, especially if your company performs “all at once” releases instead of independent ones, arises if service B, once deployed, tries to persist some information on service A before the new version of service A has been deployed, or tries to perform a search on service A using a new criterion, in this case, the field age. In this scenario, the only thing we can do is throw an error.

Option 4

This option, especially in asynchronous situations where messages can be stored to be retried later, can be a possible solution to propagate the upgrade. It will slow down our processing capabilities temporarily, and a retry mechanism needs to be in place but, it is doable.

Using Jackson to solve versioning

Renaming a field

Plain and simple, do not do it. Especially, if it is a client-facing API and not an internal one. It will save you a lot of trouble and headaches. Unfortunately, if we are persisting JSON on our databases, this will require some migrations.

If it needs to be done, think about it again. Really, rethink it. If, after rethinking it, it still needs to be done, a few steps need to be taken:

  1. Update the API object with the new field name using @JsonAlias.
  2. Release and update everything using the renamed field, and @JsonAlias for the old field name.
  3. Remove @JsonAlias for the old field name. This is a cleanup step, everything should work after step two.

Removing a field

Like in the previous case, do not do it, or think very hard about it before you do it. Again, if you finally must, a few steps need to be followed.

First, consider deprecating the field:

If it must be removed:

  1. Explicitly ignore the old property with @JsonIgnoreProperties.
  2. Remove @JsonIgnoreProperties for the old field name.

Unknown fields (adding a field)

Ignoring them

The first option is the simplest one: we do not care about new fields. It is a rare situation but it can happen. We should just ignore them:

A note of caution in this scenario is that we need to be completely sure we want to ignore all unknown properties. As an example, some APIs return errors as HTTP 200 OK and describe the error in the response body. If we are not aware of that and ignore unknown properties, we can silently miss those errors, while in other circumstances the deserialisation would just crash, making us aware.

Ignoring enums

In the same way we can ignore fields, we can ignore unknown enum values or, more appropriately, we can map them to an UNKNOWN value.

Keeping them

The most common situation is that we want to keep the fields, even if they do not mean anything to the service currently processing the object, because they will be needed upstream or downstream.

Jackson offers us two interesting annotations:

  • @JsonAnySetter
  • @JsonAnyGetter

These two annotations help us to read and write fields even if we do not know what they are.

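import com.fasterxml.jackson.annotation.JsonAnyGetter;
import com.fasterxml.jackson.annotation.JsonAnySetter;
import java.util.LinkedHashMap;
import java.util.Map;

class User {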
    @JsonAnySetter
    private final Map<String, Object> unknownFields = new LinkedHashMap<>();
    
    private Long id;
    private String firstname;
    private String lastname;

    @JsonAnyGetter
    public Map<String, Object> getUnknownFields() {
        return unknownFields;
    }
}
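To make the idea more tangible without depending on Jackson, here is a minimal plain-Java sketch of what the two annotations achieve conceptually. It is not how Jackson is implemented; the class PassThroughUser and the methods fromMap and toMap are hypothetical names used only for this illustration, and the payload is modelled as a simple Map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PassThroughUser {
    private final Map<String, Object> unknownFields = new LinkedHashMap<>();
    private String firstname;
    private String lastname;

    // Plays the role of @JsonAnySetter: known keys fill fields, the rest are kept.
    static PassThroughUser fromMap(Map<String, Object> payload) {
        PassThroughUser user = new PassThroughUser();
        for (Map.Entry<String, Object> entry : payload.entrySet()) {
            switch (entry.getKey()) {
                case "firstname": user.firstname = (String) entry.getValue(); break;
                case "lastname": user.lastname = (String) entry.getValue(); break;
                default: user.unknownFields.put(entry.getKey(), entry.getValue());
            }
        }
        return user;
    }

    // Plays the role of @JsonAnyGetter: unknown entries are written back out.
    Map<String, Object> toMap() {
        Map<String, Object> payload = new LinkedHashMap<>();
        payload.put("firstname", firstname);
        payload.put("lastname", lastname);
        payload.putAll(unknownFields);
        return payload;
    }

    public static void main(String[] args) {
        Map<String, Object> incoming = new LinkedHashMap<>();
        incoming.put("firstname", "Ada");
        incoming.put("lastname", "Lovelace");
        incoming.put("age", 36); // added in version 2, unknown to this service
        System.out.println(PassThroughUser.fromMap(incoming).toMap());
    }
}
```

A service in the middle of the chain can read and re-emit the object this way, and the age field survives the round trip even though the service never declared it.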

Keeping enums

In the same way we are keeping the fields, we can keep the enums. The best way to achieve that is to map them as strings but leave the getters and setters as the enums.

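import com.fasterxml.jackson.annotation.JsonAutoDetect;
import com.fasterxml.jackson.annotation.JsonAutoDetect.Visibility;

@JsonAutoDetect(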
    fieldVisibility = Visibility.ANY,
    getterVisibility = Visibility.NONE,
    setterVisibility = Visibility.NONE)
class Process {
    private Long id;
    private String state;

    public void setState(State state) {
        this.state = nameOrNull(state);
    }

    public State getState() {
        return nameOrDefault(State.class, state, State.UNKNOWN);
    }

    public String getStateRaw() {
        return state;
    }
}

enum State {
    READY,
    IN_PROGRESS,
    COMPLETED,
    UNKNOWN
}

It is worth pointing out that the annotation @JsonAutoDetect tells Jackson to ignore the getters and setters and perform the serialisation based on the fields defined.
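The Process class above relies on two helpers, nameOrNull and nameOrDefault, that are not shown. A possible implementation, written here only as an assumption about what they do (the holder class EnumNames is a hypothetical name), could look like this:

```java
final class EnumNames {
    enum State { READY, IN_PROGRESS, COMPLETED, UNKNOWN }

    /** Returns the constant's name, or null when no value is set. */
    static String nameOrNull(Enum<?> value) {
        return value == null ? null : value.name();
    }

    /** Resolves a raw name to a constant, falling back when it is null or not recognised. */
    static <E extends Enum<E>> E nameOrDefault(Class<E> type, String name, E fallback) {
        if (name == null) {
            return fallback;
        }
        try {
            return Enum.valueOf(type, name);
        } catch (IllegalArgumentException e) {
            return fallback; // e.g. a value introduced by a newer version of the API
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOrDefault(State.class, "READY", State.UNKNOWN));
        System.out.println(nameOrDefault(State.class, "ARCHIVED", State.UNKNOWN));
    }
}
```

Because the raw string is what gets serialised, a value this service does not understand is still written back unchanged, while the typed getter reports the fallback.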

Unknown types

One of the things Jackson can manage is polymorphism but this implies we need to deal sometimes with unknown types. We have a few options for this:

Error when unknown type

We prepare Jackson to read and deal with known types but it will throw an error when an unknown type is given, this being the default behaviour:

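import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;

@JsonTypeInfo(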
    use = JsonTypeInfo.Id.NAME,
    include = JsonTypeInfo.As.PROPERTY)
@JsonSubTypes({
    @JsonSubTypes.Type(value = SelectionProcess.class, name = "SELECTION_PROCESS"),
})
interface Process {
}

Keeping the new type

In a very similar way to what we have done for fields, Jackson allows us to define a default or fallback type when the given type is not found, which, put together with our previous unknown fields implementation, can solve our problem.

@JsonTypeInfo(
    use = JsonTypeInfo.Id.NAME,
    include = JsonTypeInfo.As.PROPERTY,
    property = "@type",
    defaultImpl = AnyProcess.class)
@JsonSubTypes({
    @JsonSubTypes.Type(value = SelectionProcess.class, name = "SELECTION_PROCESS"),
    @JsonSubTypes.Type(value = ValidationProcess.class, name = "VALIDATION_PROCESS"),
})
interface Process {
    String getType();
}

class AnyProcess implements Process {
    @JsonAnySetter
    private final Map<String, Object> unknownFields = new LinkedHashMap<>();

    @JsonProperty("@type")
    private String type;

    @Override
    public String getType() {
        return type;
    }

    @JsonAnyGetter
    public Map<String, Object> getUnknownFields() {
        return unknownFields;
    }
}

And, with all of this, we have decent compatibility implemented, all provided by the Jackson serialisation.

We can go one step further and implement some basic classes with the common code, e.g., unknownFields, and have our API objects extend them for simplicity, to avoid boilerplate code and to follow some good practices. Something similar to:

class Compatibility {
    ....
}

class MyApiObject extends Compatibility {
    ...
}

With this, we have a new tool under our belt that we can consider and use whenever it is necessary.


Domain impersonation techniques

A few days ago I was talking with one of my acquaintances and he told me he had recently been the victim of a phishing attack. Luckily, just after entering his data in a bogus form, he realised something was not right, so he could contact his service provider and fix the problem without too many headaches.

During the conversation he showed me the message he received with the link he clicked on and, I was quick to notice it was a fake message. It was just an SMS message, not enough text to check as it can be done in an email but, the link was obviously (to me) a fake one.

From other conversations I have had with him in the past, I know he has some basic knowledge and security awareness, especially because his company makes its employees do some basic training about it, which is very good. But when discussing it with him, I realised he had never heard the term domain impersonation and, other than a basic comment about checking links before you click them, he was never given any examples of what to look for.

To add my two cents and help raise awareness, we are going to quickly review a few of the most common techniques and try to learn a bit more by example.

Let’s say we have a service provider that offers its services through a web page hosted on telecomexample.org. This will be our starting address, the real and original one. Now, let’s see what kind of techniques we can apply to mislead and trick unfortunate users.

Omission

This technique consists of skipping one character from the original address. For example, telecmexample.org. As we can see, the example is omitting an “o”. The longer the address is, the easier it is to miss that.

Typosquatting: Substitution

This technique consists of replacing a character with a similar one. For example, telecomexemple.org. As we can see, we are replacing the “a” with an “e”. Other common replacements are: “i -> 1” or “i -> l”.

Homoglyph

This technique consists of replacing a character with another similar-looking character from a different alphabet. Usually from Latin, Greek or Cyrillic. For example, teʟecomexample.org. In this case, we have replaced the “l”.
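One simple way to spot this kind of trick programmatically is to look for characters outside the ASCII range and to convert the domain to its ASCII (Punycode) form, which is what actually gets registered. The sketch below uses java.net.IDN from the standard library; the class HomoglyphCheck and its method names are hypothetical, and the check is deliberately naive (legitimate internationalised domains also contain non-ASCII characters).

```java
import java.net.IDN;

final class HomoglyphCheck {
    /** True when the domain contains any character outside the ASCII range. */
    static boolean hasNonAscii(String domain) {
        return domain.chars().anyMatch(c -> c > 127);
    }

    /** ASCII (Punycode) form of the domain; labels that needed encoding start with "xn--". */
    static String asciiForm(String domain) {
        return IDN.toASCII(domain);
    }

    public static void main(String[] args) {
        // A small-capital L look-alike (U+029F) takes the place of the "l".
        String suspicious = "te\u029Fecomexample.org";
        System.out.println(hasNonAscii(suspicious));
        System.out.println(asciiForm(suspicious));
    }
}
```

Seeing the "xn--" prefix in the converted address is a strong hint that the link does not point to the plain ASCII domain it appears to show.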

Addition

This technique consists of adding an extra character to the address. For example, tellecomexample.org. Reading it carefully, we can see an extra “l” has been added.

Transposition

This technique just alters the order of one or more characters on the address. For example, telecomxeample.org. In this case, we have swapped the “e” and the “x”.

Homophone

This technique uses similar-sounding words such as “narrows” and “narroughs”. For example, telecomsample.org, where the word “example” has been replaced by the similar-sounding word “sample”. Note: Probably there are better examples, but given the address domain, and not being a native speaker, it is hard to find one. Feel free to comment with better suggestions.

Subdomain

The service provider domain is used as a subdomain for a domain owned by the attackers. For example, telecomexample.accounts.org, where the attacker owns the domain “accounts.org”.

TLD swap

This technique uses the service provider domain but with a different top-level domain. For example, telecomexample.com, where the top-level domain COM is used instead of the real one, ORG.

Hyphenation

This technique adds an intermediate hyphen to the domain. For example, telecom-example.org, where a hyphen has been added between the words “telecom” and “example”.

Added keywords

In this case, an extra word is added to the original domain, trying to mislead users. For example, telecomexamplelogin.org, where the word “login” has been added to the original domain.

Today, just a short article but, I hope it helps to raise some awareness about very common domain impersonation techniques used by attackers to deceive users.
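Several of the techniques above are purely mechanical, which is part of why they are so common. As a rough sketch, the omission, transposition, and addition variants of a name can be generated with a few loops; the class DomainVariants and its method names are hypothetical, and the "addition" shown here only covers doubling an existing character.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class DomainVariants {
    /** Omission: drop one character at a time. */
    static Set<String> omissions(String name, String tld) {
        Set<String> out = new LinkedHashSet<>();
        for (int i = 0; i < name.length(); i++) {
            out.add(name.substring(0, i) + name.substring(i + 1) + "." + tld);
        }
        return out;
    }

    /** Transposition: swap each pair of neighbouring characters. */
    static Set<String> transpositions(String name, String tld) {
        Set<String> out = new LinkedHashSet<>();
        for (int i = 0; i < name.length() - 1; i++) {
            char[] chars = name.toCharArray();
            char tmp = chars[i];
            chars[i] = chars[i + 1];
            chars[i + 1] = tmp;
            out.add(new String(chars) + "." + tld);
        }
        out.remove(name + "." + tld); // swapping two equal letters changes nothing
        return out;
    }

    /** Addition: repeat one character at a time. */
    static Set<String> additions(String name, String tld) {
        Set<String> out = new LinkedHashSet<>();
        for (int i = 0; i < name.length(); i++) {
            out.add(name.substring(0, i + 1) + name.charAt(i) + name.substring(i + 1) + "." + tld);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(omissions("telecomexample", "org"));
        System.out.println(transpositions("telecomexample", "org"));
        System.out.println(additions("telecomexample", "org"));
    }
}
```

A list like this can be used defensively, for instance to check which look-alike domains are already registered for a brand.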


JVM Deep Dive

What is the JVM?

The Java Virtual Machine (JVM) is a specification that provides a runtime environment in which Java code can be executed. It is the component of the technology responsible for its hardware and operating system independence. The JVM has two primary functions:

  • To allow Java programs to run on any device or operating system (“Write once, run anywhere” principle).
  • To manage and optimize program memory.

There are three important distinctions that need to be made when talking about the JVM:

  • JVM specification: The specification document describes an abstract machine and formally states what is required of a JVM implementation; it does not describe any particular implementation of the Java Virtual Machine. Implementation details are left to the creativity of implementors. The specification document for version 16 can be found here.
  • JVM implementations: They are concrete implementations of the JVM specification. As has been said before, it is up to the implementors how to develop and materialise the specification. This allows different implementations to focus on the improvement of different areas, prioritise different parts of the specification, or build non-standard implementations. Some reasons to develop an implementation are:
    • Platform support: Run Java on a platform for which Oracle does not provide a JVM.
    • Resource usage: Run Java on a device that does not have enough resources to run Oracle’s implementation.
    • Performance: Oracle’s implementation is not fast, scalable or predictable enough.
    • Licensing: Disagreement with Oracle’s licensing policy.
    • Competition: Offering an alternative.
    • Research or fun: Because, why not?
    • Some examples of implementations are Azul Zulu, Eclipse OpenJ9, GraalVM, or HotSpot.
  • JVM instances: A JVM instance is a running implementation of the JVM.

JVM Architecture

The JVM consists of three distinct components:

  • Class Loader
  • Runtime Memory/Data Area
  • Execution Engine

Class Loader

Class loaders are responsible for dynamically loading Java classes into the data areas of the JVM at runtime. There are three phases in the class loading process: loading, linking, and initialization.

Loading

Loading involves taking the binary representation (bytecode) of a class or interface with a particular name, and generating the original class or interface from that. There are three built-in class loaders available in Java:

  • Bootstrap Class Loader: It loads the standard Java packages like java.lang, java.net, java.util, and so on. In Java 8 and earlier, these packages are present inside the rt.jar file and other core libraries present in the $JAVA_HOME/jre/lib directory; since Java 9 they are packaged as modules.
  • Extension Class Loader: The extension class loader is a child of the bootstrap class loader and takes care of loading the extensions of the standard core Java classes so that they are available to all applications running on the platform. In Java 8 and earlier, extensions are present in the $JAVA_HOME/jre/lib/ext directory; since Java 9 this loader has been replaced by the platform class loader.
  • Application Class Loader: It loads the files present on the classpath. By default, the classpath is set to the current directory of the application but, it can be modified.

Class loading follows a hierarchical delegation pattern: when asked for a class, a class loader first delegates the request to its parent, all the way up to the bootstrap class loader, and only if the parent is unable to find the class does the child (Bootstrap -> Extension -> Application) try to load it itself. If the last child class loader is not able to load the class either, it throws NoClassDefFoundError or ClassNotFoundException.
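The hierarchy can be observed from code by walking up the parents of a class loader. The following is a small sketch using only the standard ClassLoader API; the class LoaderChain and the label "bootstrap" are choices made for this example, since the Java API represents the bootstrap class loader as null. The concrete loader class names printed depend on the Java version and on how the program is launched.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChain {
    /** Walks from the given loader up through its parents; null stands for the bootstrap loader. */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            names.add(current.getClass().getName());
        }
        names.add("bootstrap"); // represented as null in the Java API
        return names;
    }

    public static void main(String[] args) {
        // Loader chain of an application class, from child to parent.
        System.out.println(chain(LoaderChain.class.getClassLoader()));
        // Core classes such as String are loaded by the bootstrap loader.
        System.out.println(String.class.getClassLoader());
    }
}
```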

Linking

After a class is loaded into memory, it undergoes the linking process. Linking a class or interface involves combining the different elements and dependencies of the program together. Linking includes the following steps:

  • Verification: This phase checks the structural correctness of the .class file by checking it against a set of constraints or rules. If verification fails for some reason, we get a VerifyError. A related failure happens when a class has been compiled for a newer version of Java than the running JVM supports, in which case the class is rejected with an UnsupportedClassVersionError.
  • Preparation: In this phase, the JVM allocates memory for the static fields of a class or interface, and initializes them with default values.
  • Resolution: In this phase, symbolic references are replaced with direct references present in the runtime constant pool.

Initialisation

Initialisation involves executing the initialisation method of the class or interface. This includes executing the static blocks and assigning values to all the static variables (instance constructors are not part of this phase; they run later, when objects are created). This is the final stage of class loading.
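A short experiment shows that initialisation is lazy: the static block of a class runs the first time the class is actively used, not when the program starts. The names InitOrder, Lazy, and touch are hypothetical and exist only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class InitOrder {
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        static int value = 42; // static variable assignment, part of initialisation

        static {
            EVENTS.add("Lazy initialised"); // static block, runs only once
        }
    }

    /** Records what happens around the first access to Lazy. */
    static int touch() {
        EVENTS.add("before first use");
        int v = Lazy.value; // reading a non-constant static field triggers initialisation
        EVENTS.add("after first use");
        return v;
    }

    public static void main(String[] args) {
        System.out.println(touch());
        System.out.println(EVENTS);
    }
}
```

Note that value is intentionally not final: a static final field with a constant value is inlined by the compiler and reading it would not trigger initialisation.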

Runtime Data Area

The Runtime Data Area is divided into five major components:

Method Area

All the class level data such as the run-time constant pool, field, and method data, and the code for methods and constructors, are stored here. If the memory available in the method area is not sufficient for the program startup, the JVM throws an OutOfMemoryError.

It is important to point out that the method area is created on virtual machine start-up, and there is only one method area per JVM.

Heap Area

All the objects and their corresponding instance variables are stored here. This is the run-time data area from which memory for all class instances and arrays is allocated.

Again, it is important to point out that the heap is created on virtual machine start-up, and there is only one heap area per JVM.

Stack Area

Whenever a new thread is created in the JVM, a separate runtime stack is also created at the same time. All local variables, method calls, and partial results are stored in the stack area. If the processing being done in a thread requires a larger stack size than what’s available, the JVM throws a StackOverflowError.

For every method call, one entry is made in the stack memory which is called the Stack Frame. When the method call is complete, the Stack Frame is destroyed.

The Stack Frame is divided into three sub-parts:

  • Local Variables: Each frame contains an array of variables known as its local variables. All local variables and their values are stored here. The length of this array is determined at compile-time.
  • Operand Stack: Each frame contains a last-in-first-out (LIFO) stack known as its operand stack. This acts as a runtime workspace to perform any intermediate operations. The maximum depth of this stack is determined at compile-time.
  • Frame Data: All symbols corresponding to the method are stored here. This also stores the catch block information in case of exceptions.
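The relationship between method calls and stack frames is easy to observe: a method that calls itself without a stopping condition keeps pushing frames until the thread's stack is exhausted. The sketch below (StackDepth and measure are hypothetical names) counts how many frames fit before the StackOverflowError is thrown. The number varies between runs and depends on the JVM, the stack size settings, and the size of each frame.

```java
final class StackDepth {
    private static int depth;

    private static void recurse() {
        depth++;   // each call pushes a new stack frame
        recurse();
    }

    /** Recurses until the thread's stack is exhausted and returns the depth reached. */
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time we get here the frames have been unwound again.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measure());
    }
}
```

Catching StackOverflowError is done here only to demonstrate the limit; it is not something to rely on in regular application code.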

Program Counter (PC) Registers

The JVM supports multiple threads at the same time. Each thread has its own PC Register to hold the address of the currently executing JVM instruction. Once the instruction is executed, the PC register is updated with the next instruction.

Native Method Stacks

The JVM contains stacks that support native methods. These methods are written in a language other than Java, such as C and C++. For every new thread, a separate native method stack is also allocated.

Execution Engine

Once the bytecode has been loaded into the main memory, and details are available in the runtime data area, the next step is to run the program. The Execution Engine handles this by executing the code present in each class.

However, before executing the program, the bytecode needs to be converted into machine language instructions. The JVM can use an interpreter or a JIT compiler for the execution engine.

The Execution Engine has three main components:

Interpreter

The interpreter reads and executes the bytecode instructions one by one. Due to this instruction-by-instruction execution, the interpreter is comparatively slow. In addition, if a method is called multiple times, a new interpretation is required every time.

JIT Compiler

The JIT Compiler neutralizes the disadvantage of the interpreter. The Execution Engine uses the interpreter to execute the bytecode, but when it finds repeated code it uses the JIT compiler, which compiles that bytecode to native code. This native code is then used directly for repeated method calls, which improves the performance of the system. The JIT Compiler has the following components:

  • Intermediate Code Generator: Generates intermediate code.
  • Code Optimizer: Optimizes the intermediate code for better performance.
  • Target Code Generator: Converts intermediate code to native machine code.
  • Profiler: Finds the hotspots (code that is executed repeatedly).

Garbage Collector

The Garbage Collector (GC) collects and removes unreferenced objects from the heap area. It is the process of reclaiming unused runtime memory automatically by destroying those objects. Garbage collection makes Java memory-efficient because it frees space for new objects.

Garbage collection is done automatically by the JVM and does not need to be handled separately. It can also be requested by calling System.gc(), but its execution is not guaranteed.

It involves two phases:

  • Mark: In this step, the GC identifies the unused objects in memory.
  • Sweep: In this step, the GC removes the objects identified during the previous phase.

The JVM has shipped several different garbage collectors, including:

  • Serial GC: This is the simplest implementation of GC, and is designed for small applications running on single-threaded environments. It uses a single thread for garbage collection. When it runs, it leads to a stop-the-world event where the entire application is paused.
  • Parallel GC: Also known as the Throughput Collector, this was the default implementation of GC in the JVM up to Java 8 (G1 has been the default since Java 9). It uses multiple threads for garbage collection but still pauses the application when running.
  • Garbage First (G1) GC: G1GC was designed for multi-threaded applications that have a large heap size available (more than 4GB). It partitions the heap into a set of equal-size regions and uses multiple threads to scan them. G1GC identifies the regions with the most garbage and performs garbage collection on those regions first.
  • Concurrent Mark Sweep (CMS) GC: Deprecated in Java 9 and removed in Java 14.

Java Native Interface (JNI)

JNI acts as a bridge to code written in other programming languages such as C, C++, and so on. This is especially helpful in cases where you need functionality that is not entirely supported by Java, like some platform-specific features that can only be written in C.

Native Method Libraries

Native Method Libraries are libraries that are written in other programming languages, such as C, C++, and assembly. These libraries are usually present in the form of .dll or .so files. These native libraries can be loaded through JNI.

JVM memory structure

As explained earlier, the JVM manages memory automatically with the help of the garbage collector. Memory management is the process of allocating and de-allocating objects in memory.

We have already described the main memory areas in the “Runtime Data Area” section, but let’s explore in a bit more detail how they work.

Heap Area

Heap space in Java is used for dynamic memory allocation for Java objects and JRE classes at runtime. New objects are always created in heap space, and the references to these objects are stored in stack memory. These objects have global access and can be accessed from anywhere in the application. The heap is further broken into smaller parts called generations:

  • Young Generation: This is where all new objects are allocated and aged. A minor Garbage collection occurs when this fills up.
    • Eden space: It is a part of the Young Generation space. When we create an object, the JVM allocates memory from this space.
    • Survivor space: It is also a part of the Young Generation space. It contains objects that have survived minor GC cycles. There are two of them: Survivor Space 0 and Survivor Space 1.
  • Old or Tenured Generation: This is where long-surviving objects are stored. When objects are stored in the Young Generation, a threshold for the object’s age is set and, when that threshold is reached, the object is moved to the old generation.
  • Permanent Generation: This consists of JVM metadata for the runtime classes and application methods. Since Java 8 it has been replaced by Metaspace.

This area works as follows:

  1. When an object is created, it is first allocated in the Eden space. This space is not that big, so it fills up quite fast. The garbage collector then runs on the Eden space and clears all unreferenced objects.
  2. When the GC runs, it moves all objects that survive the collection into Survivor Space 0. On the next run, the objects that still survive are moved from Survivor Space 0 into Survivor Space 1.
  3. If an object survives X rounds of the garbage collector (X depends on the JVM implementation), it is likely to stay alive for a long time, and it gets moved into the Old space.
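
The ageing process can be sketched with a small simulation. This is a toy model in Python, not JVM code: the object names, the TENURING_THRESHOLD value and the minor_gc function are all made up for the illustration, and the two survivor spaces are collapsed into a single dictionary of ages.

```python
# Toy model of object ageing in a generational heap (illustration only).
TENURING_THRESHOLD = 3  # hypothetical value; the real one depends on the JVM


def minor_gc(young, old, live):
    """Run one minor collection.

    young: dict mapping object name -> number of minor GCs survived so far
    old:   set of object names already promoted to the old generation
    live:  set of object names that are still referenced
    """
    survivors = {}
    for name, age in young.items():
        if name not in live:
            continue  # unreferenced object: its memory is reclaimed
        if age + 1 >= TENURING_THRESHOLD:
            old.add(name)  # survived enough rounds: promoted
        else:
            survivors[name] = age + 1  # copied to the other survivor space
    return survivors, old


young = {"request": 0, "cache": 0}
old = set()
for round_number in range(1, 4):
    # "request" is only referenced during the first round.
    live = {"request", "cache"} if round_number == 1 else {"cache"}
    young, old = minor_gc(young, old, live)
    print(round_number, young, sorted(old))
```

After the third round the short-lived "request" object is gone and the long-lived "cache" object has been promoted to the old generation.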

Metaspace (PermGen)

Metaspace is a new memory space starting from the Java 8 version; it has replaced the older PermGen memory space.

Metaspace is a special memory space separated from the main heap; it is allocated from native memory. The JVM keeps track of loaded class metadata in the Metaspace. Additionally, the JVM stores all the static content in this memory section. This includes all the static methods, primitive variables, and references to the static objects. It also contains data about bytecode, names, and JIT information.

Class metadata is the runtime representation of Java classes within a JVM process, basically any information the JVM needs to work with a Java class. That includes, but is not limited to, the runtime representation of data from the JVM class file format.

Metaspace is only released when the GC runs and unloads class loaders.

Performance Enhancements

In the Oracle documentation, some performance enhancements are described. These enhancements are:

  • Compact Strings
  • Tiered Compilation
  • Compressed Ordinary Object Pointers
  • Zero-Based Compressed Ordinary Object Pointers
  • Escape Analysis

For details on these enhancements, it is better to check the documentation, which has a solid explanation of these topics.

Tuning the Garbage Collector

Tuning should be the last option we use to increase the throughput of the application, and only when we see a drop in performance because longer GC pauses are causing application timeouts.

Java provides a lot of memory switches that we can use to set the memory sizes and their ratios. Some of the commonly used memory switches are:

-Xms For setting the initial heap size when the JVM starts
-Xmx For setting the maximum heap size
-Xmn For setting the size of the young generation (the rest is old generation)
-XX:PermSize For setting the initial size of the Permanent Generation memory (before Java 8)
-XX:MaxPermSize For setting the maximum size of the Permanent Generation memory (before Java 8)
-XX:SurvivorRatio For setting the ratio between the Eden space and each Survivor space
-XX:NewRatio For setting the ratio of old/new generation sizes. The default value is 2
-XX:+UseSerialGC For enabling the Serial garbage collector
-XX:+UseParallelGC For enabling the Parallel garbage collector
-XX:+UseConcMarkSweepGC For enabling the CMS garbage collector
-XX:ParallelCMSThreads For setting the number of threads the CMS collector uses
-XX:+UseG1GC For enabling the G1 garbage collector
-XX:+HeapDumpOnOutOfMemoryError For creating a heap dump file when an OutOfMemoryError happens
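
As a small sketch of how these switches end up on a command line, the following Python helper assembles the arguments for a java invocation. The function name, the heap sizes and the app.jar file are made up for the example; the switches themselves (-Xms, -Xmx, -XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseG1GC and -XX:+HeapDumpOnOutOfMemoryError) are the real HotSpot spellings.

```python
def jvm_args(initial_heap, max_heap, collector="G1", heap_dump_on_oom=True):
    """Build a list of JVM memory switches (hypothetical helper)."""
    collectors = {
        "Serial": "-XX:+UseSerialGC",
        "Parallel": "-XX:+UseParallelGC",
        "G1": "-XX:+UseG1GC",
    }
    args = [f"-Xms{initial_heap}", f"-Xmx{max_heap}", collectors[collector]]
    if heap_dump_on_oom:
        args.append("-XX:+HeapDumpOnOutOfMemoryError")
    return args


# Sizes and jar name are placeholders, not recommendations.
command = ["java", *jvm_args("512m", "2g"), "-jar", "app.jar"]
print(" ".join(command))
# java -Xms512m -Xmx2g -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -jar app.jar
```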

Tools

After all this explanation and, in addition to all the configuration options, a very interesting point is monitoring our Java memory. To do this, there are multiple tools we can use:

jstat

It is a utility that provides information about the performance and resource consumption of running Java applications. We can use the command with the garbage collection option (-gc) to obtain various information from a Java process. The output contains the following columns:

S0C – Current survivor space 0 capacity (KB)
S1C – Current survivor space 1 capacity (KB)
S0U – Survivor space 0 utilization (KB)
S1U – Survivor space 1 utilization (KB)
EC – Current eden space capacity (KB)
EU – Eden space utilization (KB)
OC – Current old space capacity (KB)
OU – Old space utilization (KB)
MC – Metaspace capacity (KB)
MU – Metaspace utilization (KB)
CCSC – Compressed class space capacity (KB)
CCSU – Compressed class space used (KB)
YGC – Number of young generation garbage collector events
YGCT – Young generation garbage collector time
FGC – Number of full GC events
FGCT – Full garbage collector time
GCT – Total garbage collector time
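
As a sketch of how this output could be consumed in a script, the following Python snippet parses the two lines printed by the garbage collection option into a dictionary and computes how full the Eden and old spaces are. The sample text is made up for the illustration (the numbers do not come from a real process), and the parser assumes every column holds a number, which may not be the case for every JVM version or collector.

```python
def parse_jstat_gc(output):
    """Map each column name of the jstat GC output to its numeric value."""
    lines = output.strip().splitlines()
    header, values = lines[0].split(), lines[1].split()
    return {name: float(value) for name, value in zip(header, values)}


# Made-up sample with the columns described above (illustrative values only).
SAMPLE = """
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
1024.0 1024.0  0.0   512.0  8192.0   4096.0   20480.0    10240.0  4864.0 2432.0 512.0  256.0      4    0.020     1    0.030    0.050
"""

stats = parse_jstat_gc(SAMPLE)
eden_used_percent = 100 * stats["EU"] / stats["EC"]
old_used_percent = 100 * stats["OU"] / stats["OC"]
print(f"Eden used: {eden_used_percent:.1f}%")            # Eden used: 50.0%
print(f"Old generation used: {old_used_percent:.1f}%")   # Old generation used: 50.0%
```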

jmap

It is a utility that prints memory-related statistics for a running VM or a core file. Oracle suggests using the newer jcmd utility instead of jmap for enhanced diagnostics and reduced performance overhead.

jcmd

It is a utility used to send diagnostic command requests to the JVM, where these requests are useful to control, troubleshoot, and diagnose the JVM and Java applications. It must be used on the same machine where the JVM is running, and with the same effective user and group identifiers that were used to launch the JVM.

jhat

It is a utility that provides a convenient, easy to use, browser-based Heap Analysis Tool (HAT). The tool parses a heap dump in binary format (e.g., a heap dump produced by jcmd).

This is slightly different from the other tools because, when we execute it, it starts a web server that we can access from our browser to read the results.

VisualVM

VisualVM allows us to get detailed information about Java applications while they are running on a JVM, which can be on a local or a remote system. It is also possible to capture data about the JVM and save it to the local system. VisualVM can do CPU sampling, memory sampling, run garbage collections, analyze heap errors, take snapshots, and more.

JConsole

It is a graphical monitoring tool to monitor the JVM and Java applications on either a local or a remote machine. JConsole uses the underlying features of the JVM to provide information on the performance and resource consumption of applications running on the Java platform using Java Management Extensions (JMX) technology.

Memory Analyzer (MAT)

The Eclipse Memory Analyzer is a fast and feature-rich graphical Java heap analyzer that helps you find memory leaks and reduce memory consumption.

References

JVM Deep Dive

Diagrams as Code

I imagine that everyone reading this blog is, by now, familiar with the term Infrastructure as Code (IaC). If not because we have written a few articles about it on this blog, then probably because it is a widely used term nowadays.

While I assume familiarity with the term IaC, I have not heard a lot of people talking about a similar concept focused on diagrams, called Diagrams as Code (DaC). I am someone who finds diagrams very useful, both to get a general view of how a system works when I arrive at a new environment and to give explanations to a new joiner. But, at the same time, I must recognise that sometimes I am not diligent enough to update them or, even worse, I have arrived at projects where the diagrams are either too old to make sense or there are none.

The reasons for that can be various: diagrams can be hard for developers to maintain, lack of time, lack of knowledge of the system, no obvious location for the editable file, only obsolete diagrams available, and so on.

I have been playing lately with the DaC concept and, with a little bit of effort (all new practices require some until they become part of the workflow), it can fill this gap and help developers, and other profiles in general, to keep diagrams up to date.

In addition, there are some extra benefits of using these tools, such as easy version control, needing only a text editor to modify a diagram, and the ability to generate the diagrams anywhere, even as part of our pipelines.

This article is going to cover some of the tools I have found and played with lately. The goal is to have a brief introduction to the tools, become familiar with their capabilities and restrictions, and try to figure out if they can be introduced in our production environments as a long-term practice. And, why not, compare which tool produces the most eye-catching diagrams since, in the end, people are going to pay more attention to those.

Just a quick note before we start: all the tools we are going to see are open source and freely available. I am not going to explore any paid tools, and I am not involved in any way with the tools we are going to use.

Graphviz

The first tool, actually a library, we are going to see is Graphviz. I will say in advance that we are not exploring this option in depth because it is too low level for my taste and for the purpose we are trying to achieve. It is a great solution if you want to generate diagrams as part of your applications. In fact, some of the higher-level solutions we are going to explore use this library to generate their diagrams. With that said, they define themselves as:

Graphviz is open source graph visualization software. Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks. It has important applications in networking, bioinformatics, software engineering, database and web design, machine learning, and in visual interfaces for other technical domains.

The Graphviz layout programs take descriptions of graphs in a simple text language, and make diagrams in useful formats, such as images and SVG for web pages; PDF or Postscript for inclusion in other documents; or display in an interactive graph browser. Graphviz has many useful features for concrete diagrams, such as options for colors, fonts, tabular node layouts, line styles, hyperlinks, and custom shapes.

https://graphviz.org

The installation process on an Ubuntu machine is quite simple using the package manager:

sudo apt install graphviz

As I have said before, we are not going to explore this library in more depth. We can check the documentation for some examples built in C and for the command line tools it offers but, for my taste, it is too low level to use directly when implementing DaC practices.
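
Just to give an idea of what the "simple text language" (DOT) looks like, here is a minimal Python sketch that turns a list of connections into a DOT description. The to_dot function, the graph name and the node names are made up for this example; only the DOT syntax itself belongs to Graphviz.

```python
def to_dot(name, edges):
    """Render a list of (source, target) pairs as a DOT directed graph."""
    lines = [f"digraph {name} {{"]
    for source, target in edges:
        lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical system: three components and their connections.
edges = [("browser", "web server"), ("web server", "database")]
print(to_dot("service", edges))
```

If we save the printed text in a file, for example service.dot, the dot command line tool installed with Graphviz should be able to render it with something like "dot -Tpng service.dot -o service.png".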

PlantUML

The next tool is PlantUML. Regardless of what the name seems to imply, the tool is able to generate multiple types of diagrams, all of them listed on their web page. Some examples are sequence, use case, class, activity, component, and state diagrams.

The diagrams are defined using a simple and intuitive language, and images can be generated in PNG, SVG, or LaTeX format.

They provide an online tool that can be used for evaluation and testing purposes but, in this case, we are going to install it locally, especially to test how hard that is and how easy it is to integrate in our CI/CD pipelines as part of our evaluation. By the way, this is one of the tools that uses Graphviz behind the scenes.

There is no installation process; the tool just requires downloading the corresponding JAR file from their download page.

Now, let’s use it. We are going to generate a State Diagram. It is just one of the examples we can find on the tool’s page. The code used to generate the example diagram is the one that follows and is going to be stored in a TXT file:

@startuml
scale 600 width

[*] -> State1
State1 --> State2 : Succeeded
State1 --> [*] : Aborted
State2 --> State3 : Succeeded
State2 --> [*] : Aborted
state State3 {
  state "Accumulate Enough Data\nLong State Name" as long1
  long1 : Just a test
  [*] --> long1
  long1 --> long1 : New Data
  long1 --> ProcessData : Enough Data
}
State3 --> State3 : Failed
State3 --> [*] : Succeeded / Save Result
State3 --> [*] : Aborted

@enduml

Now, let’s generate the diagram:

java -jar ~/tools/plantuml.jar plantuml-state.txt

The result is going to be something like:

State Diagram generated with PlantUML

As we can see, the result is pretty good, especially if we consider that it has taken us around five minutes to write the code without knowing the syntax, and we have not had to deal with positioning the elements on a drag and drop screen.

Taking a look at the examples, the diagrams are not particularly beautiful, but the ease of use of the tool and the variety of diagrams supported make it a good candidate for further exploration.

WebSequenceDiagrams

WebSequenceDiagrams is just a web page that allows us to create Sequence Diagrams in a quick and simple way. It has some advantages: it offers multiple colours, there is no need to install anything and, having only one purpose, it covers it quite well in a simple way.

We are not going to explore this option further because it does not cover our needs: we want more variety of diagrams, and it does not seem easy to integrate into our daily routines and CI/CD pipelines.

Asciidoctor Diagram

I assume everyone is more or less aware of the existence of the Asciidoctor project. The project is a fast, open-source text processor and publishing toolchain for converting AsciiDoc content to HTML5, DocBook, PDF, and other formats.

Asciidoctor Diagram is a set of Asciidoctor extensions that enable you to add diagrams, which you describe using plain text, to your AsciiDoc document.

The installation of the extension is quite simple: it is just a RubyGem that can be installed in the standard way.

gem install asciidoctor-diagram

There are other ways to use it, but we are going to do an example using the terminal and the PlantUML syntax we have already seen.

[plantuml, Asciidoctor-classes, png]     
....
class BlockProcessor
class DiagramBlock
class DitaaBlock
class PlantUmlBlock

BlockProcessor <|-- DiagramBlock
DiagramBlock <|-- DitaaBlock
DiagramBlock <|-- PlantUmlBlock
....

The result is something like:

Generated with Asciidoctor Diagrams extension

One of the advantages of this tool is that it supports multiple diagram types. As we can see, we have used the PlantUML syntax, but there are many more available. Check the documentation.

Another advantage is that it is based on Asciidoctor, which is a very well known tool and, in addition to the image, it generates an HTML page with extra content if desired. It seems worth exploring further.

Structurizr

I was going to skip this one because, despite having a free option, it requires a subscription for certain features and, besides, it does not seem as easy to integrate and use as the other tools we are seeing.

Despite all of this, I thought it was worth mentioning because of the demo page they offer where, with just a few clicks, you can see the diagram expressed in different syntaxes such as PlantUML or WebSequenceDiagrams.

Diagrams

Diagrams is a tool that seems to have been implemented explicitly to follow the Diagrams as Code practice with a focus on infrastructure. It allows you to write diagrams using Python and, in addition to supporting and having nice images for the main cloud providers, it allows you to fetch non-available images to use them in your diagrams.

Installation can be done using any of the common mechanisms available in Python, in our case, pip3.

pip3 install diagrams

This is another one of the tools that, behind the scenes, uses Graphviz to do its job.

Let’s create our diagram now:

from diagrams import Cluster, Diagram
from diagrams.onprem.analytics import Spark
from diagrams.onprem.compute import Server
from diagrams.onprem.database import PostgreSQL
from diagrams.onprem.inmemory import Redis
from diagrams.onprem.aggregator import Fluentd
from diagrams.onprem.monitoring import Grafana, Prometheus
from diagrams.onprem.network import Nginx
from diagrams.onprem.queue import Kafka

with Diagram("Advanced Web Service with On-Premise", show=False):
    ingress = Nginx("ingress")

    metrics = Prometheus("metric")
    metrics << Grafana("monitoring")

    with Cluster("Service Cluster"):
        grpcsvc = [
            Server("grpc1"),
            Server("grpc2"),
            Server("grpc3")]

    with Cluster("Sessions HA"):
        master = Redis("session")
        master - Redis("replica") << metrics
        grpcsvc >> master

    with Cluster("Database HA"):
        master = PostgreSQL("users")
        master - PostgreSQL("slave") << metrics
        grpcsvc >> master

    aggregator = Fluentd("logging")
    aggregator >> Kafka("stream") >> Spark("analytics")

    ingress >> grpcsvc >> aggregator

And, let’s generate the diagram:

python3 diagrams-web-service.py

With that, the result is something like:

Diagram generated with Diagrams

As we can see, it is easy to understand and, best of all, quite eye-catching. Everything looks in place without the need to mess with a drag and drop tool to position our elements.

Conclusion

As always, we need to evaluate which tool best fits our use case but, after seeing a few of them, my conclusions are:

  • If I need to generate infrastructure diagrams, I will go with the Diagrams tool. It seems very easy to use, being based on Python, and the results are very visually appealing.
  • For any other type of diagram, I will be inclined to use PlantUML. It seems to support a great number of diagram types and, despite the results not being the most beautiful, they seem clear and useful enough.

Asciidoctor Diagram seems a good option if your team or organisation is already using Asciidoctor, or if we want something more than just a generated diagram.

Ubuntu Multipass

The fields of IT, software development, operations and similar tend to be full of people who like to try new trends or tools, either directly related to their day-to-day tasks or just out of curiosity. One quick way of doing this is to install all the tools and libraries on our machines and, after we have finished, try to clean everything up or, at least, revert all the mistakes or not very good practices we made while learning. Despite this being a valid way, over time our machines get polluted with orphaned dependencies, configuration files or libraries.

To avoid that, it seems better to try all the new stuff in an isolated environment and, if we like it and decide to use it in our daily environments, install it from scratch again, probably correcting some initial mistakes or avoiding some bad practices.

There are plenty of solutions out there to achieve this and have an easy to set up throw-away environment. Most of them are based on virtual machines or some kind of virtualisation: more traditional ones such as VirtualBox or VMware, or management solutions for virtual machines such as Vagrant.

Today, I just want to bring to the table a different one that I have been playing with lately and that I did not know a few months ago. I do not know how popular or widespread it is, but I think that knowing different options is always a plus. The tool is called Multipass. And, as Ubuntu describes it, it is “Ubuntu VMs on demand for any workstation. Multipass can launch and run virtual machines and configure them with cloud-init like a public cloud. Prototype your cloud launches locally for free.”

I have found it very easy to use and, for the purpose of having throw-away isolated environments lying around, quite useful.

We are going to see the installation process and the basic execution of a few commands related to an instance.

Installation

Before we start applying the steps to install Multipass on our machines, there are a couple of requirements we need to consider. They are related to the platform that is going to be used to virtualise the images. On recent operating systems, no extra requirements are needed, but some old ones have them. Check the official documentation.

For Linux:

sudo snap install multipass

For Windows:

Just download the installer from the web page and proceed with the suggested steps.

For MacOS:

MacOS offers us two different alternatives: one based on an installer file, similar to Windows, and one based on a package manager like Homebrew. If installing using the installer file, just execute it and follow the suggested steps. If installing using Homebrew, just execute the appropriate command:

brew install --cask multipass

Once the installation is done, any other command executed should be the same in all three environments.

Just as a side note, there is the possibility of using VirtualBox as the virtualisation platform if we want to, but this is completely optional and depends only on our preferences. I am not using it. If you are interested, I encourage you to go to the official documentation on this specific point.

Now that we have finished the installation, let’s create our first instance.

Creating and using an instance

Let’s check what images are available:

‘find’ execution (multipass find) – List of available images

We can see there are multiple images available but, in this case, we are going to create an instance using the latest version (20.10). By default, if no image is specified, multipass uses the last LTS version.

It is worth mentioning that, by default, multipass assigns some values to our instance in terms of CPU, disk, memory and others.

default values – multipass launch -h

‘launch’ execution (multipass launch) – Creates a new instance

As we can see, it is quite fast and, if we create a second instance, it will be even faster.

We can execute a command inside the instance:

‘exec’ execution (multipass exec) – Executes a command inside the instance

Or, we can just log in to the instance:

‘shell’ execution (multipass shell) – Log in to the instance

From now on, we can just work with this instance as we please.

There are a few commands we can use to manage our instances, such as listing running instances, available instances or information about running instances. All these commands are available in the help menu.

‘help’ execution (multipass --help) – List available commands

Mounting a shared folder

Multipass offers the possibility of mounting folders to share information between the host and the instance using the command mount.

Sharing a folder between host and instance

Cleaning (Deleting an instance)

Finally, as we do not want to leave throw-away instances lying around, after we have finished working, we can remove them.

‘delete’ execution (multipass delete) – Removes an instance

This is just a brief introduction to multipass. More complex scenarios can be found on the official documentation.

Container attack vectors

We live in a containerised world. Container solutions like Docker are now so widespread that they are no longer a niche thing or a buzzword; they are mainstream. Many companies use them, and the ones that do not are probably dreaming of doing so.

The only problem is that they are still something new. Their adoption has been fast, and they have arrived like a storm in all kinds of industries that use technology. From a security point of view we, as an industry, do not have all the awareness we should have. Containers and, especially, containers running on cloud environments partially hide the fact that they exist and that they need to be part of our security considerations. Some companies use them thinking they are completely secure, trusting that the cloud providers or the companies that generate the containers take care of everything. For less technology-focused businesses, they are even an abstraction and not a real, tangible thing. They are not the old bare metal servers, desktop machines or virtual machines these businesses were used to and, to a certain point, worried about because they were things that could be touched.

All of that has meant that, while security concerns for web applications are first-class citizens (not as much as they should be, but the situation has improved a lot in the last few years), security concerns about containers seem to be the black sheep of the family: no one talks about them. And this is not right. They should receive the same level of concern and attention, and be part of the development life cycle.

In the same way that web applications can be attacked in multiple ways, containers have their own attack vectors, some of which we are going to see here. We will see that some of them can be easily compared with known attack vectors in areas we are more aware of, like web applications.

Vulnerable application code

Containers package applications and third-party dependencies that can contain known flaws or vulnerabilities. There are thousands of published vulnerabilities that attackers can take advantage of to exploit our systems if they are found in the applications running inside the containers.

The best way to avoid running containers with known vulnerabilities is to scan the images we are going to deploy, and not just as a one-time thing. This should be part of our delivery pipelines, and the scans should run continuously. In addition to known vulnerabilities, scanners should try to find out-of-date packages that need an update. Some available scanners even try to find possible malware in the images.

Badly configured container images

When configuring how a container is going to be built, vulnerabilities that attackers can later exploit can be introduced by mistake or because not enough attention is paid to the build process. A very common example is configuring the container to run with unnecessary root permissions, giving it more privileges on the host than it really needs.

Build machine attacks

Like any piece of software, the software we use to run CI/CD pipelines and build container images can be successfully attacked. Attackers can add malicious code to our containers during the build phase, obtaining access to our production environment once the containers have been deployed, and even use these compromised containers to pivot to other parts of our systems or networks.

Supply chain attacks

Once containers have been built, they are stored in registries and retrieved or “pulled” when they are going to be run. Unfortunately, no one can guarantee the security of these registries, and an attacker can compromise a registry and replace the original image with a modified one that includes a few surprises.

Badly configured containers

When creating configuration files for our containers, e.g. a YAML file, we can make mistakes and add configurations the containers do not need. Some possible examples are unnecessary access privileges or unnecessary open ports.
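
As a rough sketch of what checking for this kind of mistake could look like, the following Python function inspects one container entry, already loaded as a dictionary, and reports risky settings. The field names (securityContext, privileged, runAsUser, allowPrivilegeEscalation, hostPort) follow the Kubernetes Pod specification, but the function, the sample container and the selection of checks are assumptions made for this example and are far from a complete audit.

```python
def audit_container(container):
    """Return a list of warnings for one container entry of a Pod spec."""
    warnings = []
    security = container.get("securityContext", {})
    if security.get("privileged", False):
        warnings.append("runs in privileged mode")
    if security.get("runAsUser") == 0:
        warnings.append("runs as root (runAsUser: 0)")
    # Treated as allowed when the field is not set explicitly.
    if security.get("allowPrivilegeEscalation", True):
        warnings.append("allows privilege escalation")
    for port in container.get("ports", []):
        if "hostPort" in port:
            warnings.append(f"binds host port {port['hostPort']}")
    return warnings


# Hypothetical container definition, as it would look after parsing the YAML.
container = {
    "name": "web",
    "image": "example/web:1.0",
    "securityContext": {"privileged": True, "runAsUser": 0},
    "ports": [{"containerPort": 8080, "hostPort": 8080}],
}
for warning in audit_container(container):
    print(f"{container['name']}: {warning}")
```

A check like this could run in the delivery pipeline before the configuration is applied, in the same spirit as the image scans mentioned earlier.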

Vulnerable host

Containers run on host machines and, in the same way we try to ensure containers are secure, hosts should be too. Sometimes they run old versions of orchestration components with known vulnerabilities, or of other components such as monitoring tools. A good idea is to minimise the number of components installed on the host, configure them correctly and apply security best practices.

Exposed secrets

Credentials, tokens or passwords are all necessary if we want our system to be able to communicate with other parts of the system. One risk is the way we supply these secret values to the container and the applications running in it. There are different approaches, with varying levels of security, that can be used to prevent any leakage.

Insecure networking

Just like non-containerised applications, containers need to communicate over networks. Some level of attention is necessary to set up secure connections among components.

Container escape vulnerabilities

Containers are designed to run in isolation from the hosts where they are running. In general, container runtimes like “containerd” or “CRI-O” have been heavily tested and are quite reliable but, as always, there are vulnerabilities yet to be discovered. Some of these vulnerabilities can let malicious code running inside a container escape out into the host. Due to the severity of this, some stronger isolation mechanisms can be worth considering.

Some other risks related to containers, although not directly to the containers themselves, can be:

  • Attacks on the code repositories of the applications deployed in the containers, poisoning them with malicious code.
  • Hosts accessible from the Internet should be protected, as expected, with other tools like firewalls, identity and access management systems, secure network configurations and others.
  • When containers run under an orchestrator, e.g. Kubernetes, a door to new attack vectors is opened. Configurations, permissions or access not properly controlled can give attackers access to our systems.

As we can see, some of the attack vectors are similar to the ones existing in more mature areas like networking or web applications but, due to the abstraction and the easy-to-use approach, the security of containers is, unfortunately, left out of consideration.

Reference: “Container Security by Liz Rice (O’Reilly). Copyright 2020 Vertical Shift Ltd., 978-1-492-05670-6”

PostgreSQL: Advisory Locks

Today, we are going to talk about PostgreSQL Advisory Locks. These locks are created by the application and its developers, and they have meaning only inside the application: PostgreSQL does not enforce their use, and they are there to fulfil a business or coding specific case. I was going to try to explain them and add some literature around them but, after reading the PostgreSQL documentation, I do not think it is necessary because the definition is easy to understand and, besides, on the same page we can find the other types of locks available, giving us some extra context. Instead, we are going to see some real-world code as an example.

Let’s say we have our shiny service that runs multiple instances at the same time in our production environment and, in that service, we run a scheduled task that updates one of our database tables, adding a different sequence number to the existing rows (buildings) for all the existing cities. Something like:

id (uuid)   city (text)   building (text)    registered (timestamp)     occurrence (bigint)
e6448a82    London        British Museum     2021/02/01 13:00:00.000    null
97347903    London        Tower of London    2021/02/01 12:59:59.999    null
7befe492    Paris         Eiffel Tower       2021/01/31 07:23:34.294    null
b426681a    Paris         Louvre Museum      2021/02/01 12:59:59.999    null
156e1f89    London        Big Ben            2021/02/01 12:59:59.999    null
Table ‘buildings’

For the curious minds wondering why we need this ‘occurrence’ sequence, one use case is an endpoint that allows other systems to synchronise these buildings. We could sort using the ‘registered’ field but two buildings in the same city can be registered at the same time, which makes it impossible to guarantee that the information is always returned in the same order. This can cause synchronisation problems, or even a missed building when the requests are paginated. We want to be able to sort the buildings in an immutable way.
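To make the pagination argument a bit more concrete, here is a minimal in-memory sketch of how a client could page through the buildings of one city using ‘occurrence’ as a cursor. Everything in it is an assumption made for illustration: the `Building` class, the `page` and `sampleLondon` helpers and the occurrence values 1 to 3 are hypothetical, and a real endpoint would run an equivalent query against the database instead of filtering a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class OccurrencePagination {

    // Hypothetical, simplified stand-in for a row of the 'buildings' table.
    static final class Building {
        final String name;
        final String registered;
        final long occurrence;

        Building(String name, String registered, long occurrence) {
            this.name = name;
            this.registered = registered;
            this.occurrence = occurrence;
        }
    }

    // Sample data: two buildings share the same 'registered' value,
    // but every building has its own (made-up) occurrence number.
    static List<Building> sampleLondon() {
        return Arrays.asList(
            new Building("Big Ben", "2021/02/01 12:59:59.999", 2),
            new Building("British Museum", "2021/02/01 13:00:00.000", 3),
            new Building("Tower of London", "2021/02/01 12:59:59.999", 1));
    }

    // Returns up to pageSize buildings with an occurrence greater than
    // afterOccurrence, ordered by occurrence.
    static List<Building> page(List<Building> buildings, long afterOccurrence, int pageSize) {
        List<Building> sorted = new ArrayList<>(buildings);
        sorted.sort(Comparator.comparingLong((Building b) -> b.occurrence));

        List<Building> result = new ArrayList<>();
        for (Building building : sorted) {
            if (building.occurrence > afterOccurrence && result.size() < pageSize) {
                result.add(building);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long cursor = 0;
        List<Building> current = page(sampleLondon(), cursor, 2);
        while (!current.isEmpty()) {
            for (Building building : current) {
                System.out.println(building.occurrence + " " + building.name);
            }
            // The last occurrence seen becomes the cursor for the next page.
            cursor = current.get(current.size() - 1).occurrence;
            current = page(sampleLondon(), cursor, 2);
        }
    }
}
```

Because the occurrence values are unique, each page starts exactly where the previous one ended, which is something the ‘registered’ timestamp alone cannot promise when two rows share the same value.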

Going back to the multiple services running the task, we can end up in some ugly situations where one task is already in the middle of updating a city when another task in a different service starts processing the same city, especially if we do this in batches due to the huge amount of data we store.

One simple solution to this is to use advisory locks, which allow us, developers, to lock a city while a task is updating it. For this purpose, PostgreSQL offers us two nice functions to work with:

  • pg_advisory_lock: Obtains an exclusive session-level advisory lock, waiting if necessary.
  • pg_try_advisory_lock: Obtains an exclusive session-level advisory lock if available. This will either obtain the lock immediately and return ‘true‘, or return ‘false‘ without waiting if the lock cannot be acquired immediately.

The full list of system administration functions related to advisory locks can be found in the PostgreSQL documentation.

For the purposes of the example code we are going to implement, we will be using the second one: if a city is already being processed, we do not want to process it again until the next scheduled run.

public void assignOccurrenceSequences() {
    final List<String> cities = buildingDao.retrievePendingCities();

    for (final String city : cities) {
        final int lockId = Math.abs(Hashing.sha256().newHasher()
            .putString(city, StandardCharsets.UTF_8)
            .hash().asInt());

        logger.info("Taking advisory_lock {} for city {} ", lockId, city);
        try (Connection connection = dataSource.getConnection()) {
            connection.setAutoCommit(true);

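            final boolean lockObtained;
            try (Statement statement = connection.createStatement();
                 ResultSet resultSet = statement.executeQuery(
                     format("select pg_try_advisory_lock(%d)", lockId))) {
                // execute() only reports whether a ResultSet was produced, so read the boolean the function returned.
                lockObtained = resultSet.next() && resultSet.getBoolean(1);
            }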

            if (lockObtained) {
                try {
                    final int updates = buildingDao.populateOccurrenceSequences(city);
                    logger.info("Assigning {} sequences for city {}", updates, city);
                } finally {
                    try (Statement statement = connection.createStatement()) {
                        statement.execute(format("select pg_advisory_unlock(%d)", lockId));
                    }

                    logger.info("Released advisory_lock {} for city {}", lockId, city);
                }
            } else {
                logger.info("advisory_lock {} for city {} already taken", lockId, city);
            }
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }
}

On lines 5, 6 and 7 we create the lock id we will use to establish the lock, derived from the city name so that all the running tasks calculate the same id for the same city. And yes, before someone points it out, we are assuming that ‘city’ is unique. With this generated lock id, we can try to acquire the lock. If we succeed, we proceed with the update. If we fail, we skip that city and carry on with the rest of the cities.
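As a side note, the lock id does not need to be a positive 32-bit number: the single-argument advisory lock functions take a `bigint` key, and negative keys are valid (which also matters because `Math.abs(Integer.MIN_VALUE)` is still negative in Java). The following is only a sketch of one possible variation that uses the JDK alone instead of a hashing library. The `AdvisoryLockIds` class and the `lockIdFor` helper are hypothetical names, and the ids it produces are not the same as the ones calculated by the code above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class AdvisoryLockIds {

    // Hypothetical helper: derives a 64-bit advisory lock key from a city name.
    // Every instance of the service computes the same key for the same city.
    static long lockIdFor(String city) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(city.getBytes(StandardCharsets.UTF_8));
            // Use the first 8 bytes of the hash as a signed long; it may be negative.
            return ByteBuffer.wrap(hash).getLong();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("London -> " + lockIdFor("London"));
        System.out.println("Paris  -> " + lockIdFor("Paris"));
    }
}
```

Using 64 bits instead of 32 makes it less likely that two different cities end up sharing a lock id, although a collision would only cause one of the cities to be skipped until the next scheduled run.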
