Nomad Multi-Environment Installation and Setup on RHEL 8 Running on AWS Cloud

Saumik Satapathy
Published in AWS Tip · 6 min read · Sep 18, 2022

What is Nomad?

  1. Nomad is a workload orchestrator, similar to Kubernetes, made by HashiCorp.
  2. It is used to deploy and manage both containerised and non-containerised applications.
  3. It can be used on both VMs and cloud instances.
  4. It’s an alternative to Kubernetes.
  5. It’s easy to use and integrates well with HashiCorp products like Vault & Consul.
  6. It’s lightweight: a single binary file of roughly 115 MB.
  7. It supports both *nix and Windows.
  8. It supports federation, meaning servers can join from any server, any region, anywhere.

Benefits

Multi-cloud support

  • Can be used in multi-cloud environments. As long as Nomad can establish a connection, it doesn’t care where the environment is.
  • Good integration with HashiCorp products like Terraform, Consul & Vault.
  • Can be used by organisations of any size, from very small to very large.

Nomad comes with two tracks of setup: one for Developers and one for Operations.

Developer

Developers need a small local environment for development, typically a server agent and a client agent on the same computer. It’s similar to Minikube for Kubernetes or CRC for OpenShift.

Run Nomad locally using developer mode:

$ nomad agent -dev

The above command simulates a small cluster consisting of a server agent and a client agent.

Operations

  • Installing & running Nomad on different servers.
  • Managing instances and submitting jobs.

It’s basically a full production-grade setup of Nomad, suitable for DR.

Nomad Architecture

  1. Server Agents

A typical cluster should have a minimum of 3 and a maximum of 5 agents in server mode to manage the cluster. One node is always the leader and the rest are followers.

  2. Client Agents

Start with a minimum of 3 client agents; this scales to thousands of nodes. In practice, the more nodes there are, the longer it takes for new ones to join the cluster.

In this demo we’re going to use AWS to host our Nomad agents.

If we look at the official Nomad Reference Architecture document, we’ll find that a multi-node setup needs three server agents and three client agents.

Let’s get our hands dirty by spinning up 6 Red Hat Enterprise Linux 8 (HVM), SSD Volume Type nodes of the m5.large instance type. (For easy demo purposes we’ll use the default VPC & subnets with a security group that has all ports open.)

We will spread the server instances in different availability zone for high availability.

We’ll log in to each node, update the packages, and change the hostname for convenience. Then update the /etc/hosts file on all nodes (servers and clients) with the hostname and private IP of each instance.

$ sudo yum update -y
$ sudo hostnamectl set-hostname nomad-server1.example.com
$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
################## Nomad #####################
172.31.24.161 nomad-server1.example.com
172.31.17.210 nomad-server2.example.com
172.31.31.16 nomad-server3.example.com
172.31.17.223 nomad-client1.example.com
172.31.16.52 nomad-client2.example.com
172.31.30.87 nomad-client3.example.com
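Rather than editing each node’s file by hand, the same block can be appended with a heredoc. The sketch below writes to a local ./hosts copy for illustration; on the real nodes the target would be /etc/hosts, written with sudo.

```shell
# Append the Nomad name mappings (adjust the private IPs to your own instances).
# Sketch only: writes to ./hosts; on each node you would target /etc/hosts via sudo tee -a.
cat <<'EOF' >> ./hosts
################## Nomad #####################
172.31.24.161 nomad-server1.example.com
172.31.17.210 nomad-server2.example.com
172.31.31.16  nomad-server3.example.com
172.31.17.223 nomad-client1.example.com
172.31.16.52  nomad-client2.example.com
172.31.30.87  nomad-client3.example.com
EOF

# Sanity check: count the entries that map an example.com name.
grep -c 'example.com' ./hosts
```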

Then we’ll install a few tools that will come in handy later, such as vim, wget and unzip.

$ sudo yum install vim wget unzip -y

Then we’ll add the official HashiCorp repo to the /etc/yum.repos.d/ directory.

$ sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
$ cat /etc/yum.repos.d/hashicorp.repo

Now we’ll install Nomad, then start the service and enable it at boot.

$ sudo yum install nomad -y
$ nomad -version
$ cat /usr/lib/systemd/system/nomad.service
$ sudo systemctl start nomad
$ sudo systemctl enable nomad

Now we’ll make sure SELinux is disabled. To do that we need to edit the SELinux config file and set SELINUX=disabled.

$ sudo cat /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
$ sudo reboot

After updating the SELinux configuration file reboot the system.

Nomad reads all the configuration files in the /etc/nomad.d/ directory that end with .hcl. We’ll edit the default configuration file and remove the server and client blocks, then create separate files for the server and client configurations.

$ cd /etc/nomad.d/
$ sudo vim nomad.hcl

Remove everything and keep only the lines below. Do this on all 6 nodes.

# Full configuration options can be found at https://www.nomadproject.io/docs/configuration
data_dir   = "/opt/nomad/data"
datacenter = "pd1"
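For a multi-node setup the agents also need to listen on the instance’s network interfaces, not just loopback. One way to do that is a bind_addr line in the same common file; note that bind_addr = "0.0.0.0" is our addition here, not part of the original file, and should be weighed against your security posture:

```hcl
# Common /etc/nomad.d/nomad.hcl — sketch with an added bind address.
# bind_addr = "0.0.0.0" is an assumption: it makes the agent listen on all
# interfaces so the other nodes can reach ports 4646-4648.
data_dir   = "/opt/nomad/data"
datacenter = "pd1"
bind_addr  = "0.0.0.0"
```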

We’ll create a separate file for the Nomad server agents named server.hcl in the same directory.

$ sudo vim /etc/nomad.d/server.hcl

The content of the file is:

server {
  # license_path is required as of Nomad v1.1.1+
  # license_path = "/opt/nomad/license.hclic"
  enabled          = true
  bootstrap_expect = 3
}

This means the cluster will expect three nodes running as server agents before bootstrapping.

Repeat the same steps on all three server agents.

After the changes, check whether the server nodes have successfully connected to the leader node.

[ec2-user@nomad-server1 ~]$ nomad server members
Name                              Address        Port  Status  Leader  Raft Version  Build  Datacenter  Region
nomad-server1.example.com.global  172.31.24.161  4648  alive   false   3             1.3.5  pd1         global
nomad-server2.example.com.global  172.31.17.210  4648  alive   false   3             1.3.5  pd1         global
Error determining leaders: 1 error occurred:
* Region "global": Unexpected response code: 500 (No cluster leader)

If any node hasn’t joined the cluster, we need to join it manually. Log in to that node and run the nomad server join command.

$ sudo nomad server join nomad-server1.example.com
Joined 1 servers successfully
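To avoid these manual joins entirely, each server agent can be pointed at its peers with a retry_join list inside the server_join stanza of server.hcl. The sketch below reuses this demo’s private IPs; adjust them to your own instances:

```hcl
server {
  enabled          = true
  bootstrap_expect = 3

  # Each server retries these addresses until the cluster forms,
  # so no manual `nomad server join` is needed.
  server_join {
    retry_join = ["172.31.24.161", "172.31.17.210", "172.31.31.16"]
  }
}
```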

Once all the server nodes have joined, the cluster will automatically pick a leader.

$ nomad server members
Name                              Address        Port  Status  Leader  Raft Version  Build  Datacenter  Region
nomad-server1.example.com.global  172.31.24.161  4648  alive   false   3             1.3.5  pd1         global
nomad-server2.example.com.global  172.31.17.210  4648  alive   false   3             1.3.5  pd1         global
nomad-server3.example.com.global  172.31.31.16   4648  alive   true    3             1.3.5  pd1         global

After the successful setup of the Nomad server nodes, we’ll now configure the client nodes. SSH into all three client nodes, change the hostname, update the packages, and reboot.

$ sudo hostnamectl set-hostname nomad-client1.example.com
$ sudo yum update -y
$ sudo reboot

Do the above steps for all the client nodes.

Install Nomad in all the client nodes.

$ sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
$ cat /etc/yum.repos.d/hashicorp.repo
$ sudo yum install nomad -y
$ nomad -version
$ cat /usr/lib/systemd/system/nomad.service
$ sudo systemctl start nomad
$ sudo systemctl enable nomad

Install Docker on all the client nodes, then start and enable the service so the Nomad Docker driver can use it.

$ sudo yum install docker -y
$ sudo systemctl enable --now docker

Go to the /etc/nomad.d/ directory and create a client.hcl file:

client {
  enabled = true
  servers = ["172.31.24.161", "172.31.17.210", "172.31.31.16"]
}
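Optionally, the Docker task driver can be tuned in the same file through the standard plugin block. The stanza below is a sketch; the volumes option shown is just an example setting, not something this setup requires:

```hcl
# Optional tuning of the Docker task driver in client.hcl.
plugin "docker" {
  config {
    # Allow jobs to mount host volumes (off by default; an example, not required).
    volumes {
      enabled = true
    }
  }
}
```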

Replace the values in servers with the private IP addresses of your three server nodes.

On a server node, run the nomad node status command. If the client nodes aren’t showing up, we need to add them manually.

Log in to each client node and run the below command to join it to the cluster.

$ nomad node config -update-servers nomad-server1.example.com

After adding all the agent nodes check the status.

$ nomad node status
ID        DC   Name                       Class   Drain  Eligibility  Status
74f6b148  dc1  nomad-client3.example.com  <none>  false  eligible     ready
7cae1031  dc1  nomad-client2.example.com  <none>  false  eligible     ready
95bf4095  dc1  nomad-client1.example.com  <none>  false  eligible     ready

Now we have a full production-ready Nomad setup running on AWS.
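To verify the cluster actually schedules work, a minimal job file can be submitted. The job below is a sketch: the hashicorp/http-echo image and the job/group/task names are our choices, and the datacenters list must match the pd1 datacenter configured earlier.

```hcl
# echo.nomad — a minimal Docker job to smoke-test the cluster.
job "http-echo" {
  datacenters = ["pd1"]   # must match the datacenter in nomad.hcl

  group "echo" {
    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo"
        args  = ["-listen", ":8080", "-text", "hello from Nomad"]
      }

      resources {
        cpu    = 100  # MHz
        memory = 64   # MB
      }
    }
  }
}
```

Submit it with `nomad job run echo.nomad` and check placement with `nomad job status http-echo`.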

There is a dashboard service running on port 4646, accessible through the public IP. In a production environment, either block port 4646 or restrict it to your own network.

Nomad GUI


A passionate Software Engineer with good hands-on experience in DevOps/SRE. Loves to share knowledge and is interested in learning from others.