This article describes how to install Mesos with the Marathon and Chronos frameworks on a Bright cluster.
Upstream documentation for these applications can be found at:
- Mesos documentation (http://mesos.apache.org/documentation/latest/)
- Marathon documentation (https://mesosphere.github.io/marathon/docs/)
- Chronos documentation (https://mesos.github.io/chronos/docs/)
The head node, which acts as the Mesos master, has two physical interfaces, for example:
- one on an internal network, eth0
- one on an external network, eth1
Each compute node that is part of the Mesos cluster likewise has two interfaces, for example:
- one on the internal network, bootif
- one on the external network, eth1
The DNS service that serves the external domain must contain an A record for each node (head node and compute nodes) that points to the IP address of the interface connected to the external network (eth1).
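The A record requirement can be checked before continuing. The following is a minimal sketch, assuming that `dig` is installed and that the nodes are named in the usual short form; the helper names `external_fqdn` and `check_a_records`, the host names `node001` and `node002`, and the domain `example.com` are hypothetical and not part of the original procedure.

```shell
# Build the external FQDN from a short hostname and a domain name.
external_fqdn() {
    printf '%s.%s\n' "$1" "$2"
}

# Print a message for every host whose external FQDN has no A record.
# Returns non-zero if at least one lookup comes back empty.
check_a_records() {
    domain=$1
    shift
    missing=0
    for host in "$@"; do
        fqdn=$(external_fqdn "$host" "$domain")
        if [ -z "$(dig +short A "$fqdn")" ]; then
            echo "no A record found for $fqdn" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (hypothetical names):
#   check_a_records example.com cluster1 node001 node002
```

Running the check from a machine that uses the external DNS server gives the same view that the Mesos web interfaces and clients will have.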
To create a software image, we can start from a clean “default-image”:
    cmsh
    softwareimage
    clone default-image mesos-image
    commit
We can also create a new category called “mesos”. This category will hold the customizations that are carried out later in this article. We set its software image to the one created previously:
    cmsh
    category
    clone default mesos
    set softwareimage mesos-image
    commit
Install the required rpms so that the Mesosphere repository can be set up:
    rpm -Uvh http://repos.mesosphere.com/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm
Now install the required rpms on the head node and on the software image:
    yum -y install mesos marathon mesosphere-zookeeper mod_ldap chronos
Configure zookeeper:
    echo 1 > /var/lib/zookeeper/myid
    cat >> /etc/zookeeper/conf/zoo.cfg << __EOF__
Configure the Mesos master, replacing <HEAD_NODE_FQDN> with the external fully-qualified domain name of the head node (e.g. cluster1.brightcomputing.com). Also replace <CLUSTER_NAME> with the name of the Mesos cluster (e.g. cluster1):
    echo <HEAD_NODE_FQDN> > /etc/mesos-master/hostname
    echo <CLUSTER_NAME> > /etc/mesos-master/cluster
    echo 1 > /etc/mesos-master/quorum
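The quorum value of 1 corresponds to the single Mesos master used in this article. The upstream Mesos documentation recommends setting the quorum to a strict majority of the masters, so a setup with more masters would need a larger value. The helper `mesos_quorum` below is a hypothetical illustration of that rule and is not part of the original procedure.

```shell
# Smallest strict majority of a set of masters: floor(n / 2) + 1.
# $1: number of Mesos masters in the cluster.
mesos_quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Example for the single-master setup described here:
#   mesos_quorum 1 > /etc/mesos-master/quorum
```

With one master the helper yields 1, which is the value written above; with three masters it yields 2.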
Configure the Mesos slave:
    echo 'ports:[1-65000]' > /cm/images/mesos-image/etc/mesos-slave/resources
    echo 'docker,mesos' > /cm/images/mesos-image/etc/mesos-slave/containerizers
Set up a finalize script so that the nodes use the public FQDN instead of the private one. Replace <DOMAIN_NAME> with the public domain name managed by the external DNS, e.g. brightcomputing.com:
    echo 'echo "$CMD_HOSTNAME.<DOMAIN_NAME>" > /localdisk/etc/mesos-slave/hostname' > /tmp/mesos_hostname
    cmsh -c "category use mesos; set finalizescript /tmp/mesos_hostname; commit"
Update the exclude list by inserting an entry for the “/etc/mesos-slave/hostname” file:
    cmsh
    category use mesos
    set excludelistupdate
The above command opens a text editor. Append an entry for “/etc/mesos-slave/hostname” at the end of the file.
Configure the Marathon service:
    cat > /etc/systemd/system/marathon.service << __EOF__
Configure the Chronos service:
    cat > /etc/systemd/system/chronos.service << __EOF__
Reload systemd:
    systemctl daemon-reload
Configure the services for the slave nodes, disabling the Mesos master. Let CMDaemon manage the Mesos slave:
    chroot /cm/images/mesos-image
    systemctl disable mesos-master
    exit
Configure the services for the master node:
    cmsh -c "device use master; services; add mesos-master; set autostart yes; set monitored yes; commit"
    cmsh -c "device use master; services; add chronos; set autostart yes; set monitored yes; commit"
Configure Apache httpd:
    cat > /etc/httpd/conf.d/marathon.conf << __EOF__
    cat > /etc/httpd/conf.d/mesos.conf << __EOF__
    cat > /etc/httpd/conf.d/chronos.conf << __EOF__
    systemctl restart httpd
Set the category “mesos” on the compute nodes (for example with a foreach command) and reboot the Mesos nodes:
    cmsh
    device
    foreach -n nodeXXX..nodeXXX (set category mesos)
    commit
    device reboot -c mesos
Access the Marathon, Chronos and Mesos web applications at the following addresses, replacing <HEADNODE_PUB_FQDN> with the public FQDN of the head node. A basic authentication prompt asks for a username and password.
    http://<HEADNODE_PUB_FQDN>/marathon
    http://<HEADNODE_PUB_FQDN>/mesos
    http://<HEADNODE_PUB_FQDN>/chronos
Create the “mesos” group with the following commands:
    cmsh
    group
    add mesos
    commit
A user can be created and added to the mesos group with the following commands (replace <USERNAME> and <PASSWORD> with a valid username and password):
    cmsh
    user
    add <USERNAME>
    set password <PASSWORD>
    commit
    group
    append mesos groupmembers <USERNAME>
    commit
The following JSON examples can be used to test the deployment of containers with Marathon:
Bridged networking with an HTTP health check, exposing a random host port that maps to port 8000 in the container:
    {
Bridged networking with an HTTP health check, exposing the fixed host port 8000 that maps to port 8000 in the container:
    {
Host-based networking that exposes container port 9090 on the host:
    {
The task definitions can be submitted to Marathon in the raw JSON form shown above. They can also be entered through a form in the Marathon GUI, which is easier and more intuitive. The GUI is available from:
http://<HEADNODE_PUB_FQDN>/marathon
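Raw submission of a definition can be sketched with `curl` against Marathon's `/v2/apps` REST endpoint. The sketch assumes that the Apache proxy exposes the Marathon API under the same `/marathon` prefix as the GUI and protects it with the same basic authentication; the helper names `marathon_apps_url` and `submit_app` and the file name `app.json` are hypothetical.

```shell
# Build the URL of the Marathon apps endpoint behind the Apache proxy.
# $1: public FQDN of the head node.
# Assumption: the proxy maps /marathon to the root of the Marathon service.
marathon_apps_url() {
    printf 'http://%s/marathon/v2/apps\n' "$1"
}

# POST a JSON application definition to Marathon.
# $1: public FQDN of the head node
# $2: user name (curl prompts for the password)
# $3: path to the JSON definition file
submit_app() {
    curl -sS -u "$2" -X POST \
         -H 'Content-Type: application/json' \
         --data-binary @"$3" \
         "$(marathon_apps_url "$1")"
}

# Example (hypothetical names):
#   submit_app cluster1.example.com alice app.json
```

Passing only the user name to `-u` keeps the password out of the shell history, because `curl` then asks for it interactively.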
Configure mesos DNS
Configure the BIND DNS server on the head node to forward queries for the “mesos” domain to the mesos DNS, which listens on port 8053:
    cat >> /etc/named.conf.include << __EOF__
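For orientation, a BIND forward zone for this purpose generally has the shape printed by the sketch below. This is an illustration under stated assumptions rather than the stanza used in this article: the helper name `forward_zone` is hypothetical, and the forwarder address 127.0.0.1 assumes that the mesos DNS runs on the head node itself.

```shell
# Print a BIND forward-zone stanza.
# $1: domain to forward, $2: forwarder address, $3: forwarder port.
forward_zone() {
    cat << EOF
zone "$1" {
    type forward;
    forward only;
    forwarders { $2 port $3; };
};
EOF
}

# Example (address is an assumption):
#   forward_zone mesos 127.0.0.1 8053
```

The `forward only` option makes BIND answer from the forwarder alone instead of falling back to normal recursion for that zone.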
Install the mesos DNS:
    curl -L https://github.com/mesosphere/mesos-dns/releases/download/v0.5.2/mesos-dns-v0.5.2-linux-amd64 -o /usr/bin/mesos-dns
    chmod +x /usr/bin/mesos-dns
A mesos DNS configuration file is created:
    cat > /etc/mesos/mesos-dns-config.json << __EOF__
A systemd unit file is created, and CMDaemon is configured with cmsh to manage the service:
    cat > /etc/systemd/system/mesos-dns.service << __EOF__
    systemctl daemon-reload
Test the DNS:
    dig leader.mesos
    dig _leader._tcp.mesos SRV
The following example deploys an application named “test” using Marathon:
    {
      "id": "test",
      "cmd": "python3 -m http.server",
      "cpus": 0.1,
      "mem": 128,
      "disk": 0,
      "instances": 1,
      "container": {
        "docker": {
          "image": "python:3",
          "network": "BRIDGE",
          "portMappings": [
            { "containerPort": 8000, "protocol": "tcp", "name": null }
          ]
        },
        "type": "DOCKER",
        "volumes": []
      },
      "env": {},
      "labels": {},
      "healthChecks": [
        {
          "protocol": "HTTP",
          "path": "/",
          "portIndex": 0,
          "gracePeriodSeconds": 300,
          "intervalSeconds": 60,
          "timeoutSeconds": 20,
          "maxConsecutiveFailures": 3
        }
      ]
    }
Several records are added to the DNS automatically:
    # dig test.marathon.mesos +short
    # dig _test._tcp.marathon.mesos SRV +short