Scalability Automated

Curiosio
3 min read · Mar 6, 2019


by Roman Bilusyak

To build ingenious optimization technology for future travel, we use multiple bare-metal servers with a custom dev stack. Running the same workload on multiple compute nodes requires all the necessary tools and libraries to be installed and configured on each node.

Problems

Adding a new dependency becomes an issue at scale: it takes orders of magnitude more effort, it introduces errors, and it turns into maintenance hell. To eliminate these maintenance issues, we needed to unify the environments on all nodes. Docker solves this problem well, and it does so without big overhead, unlike popular virtualization technologies (ESXi or VirtualBox).

Another related problem is controlled deployment and running of the workload on each node. We used bash scripts to automate that step, and it worked well at a small scale of several nodes. We have to monitor the workload and debug when it does not run on some nodes as planned. Checking workload status required accessing each node over SSH and reading console logs.
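For context, the pre-Docker flow looked roughly like the sketch below (hostnames and paths are hypothetical): loop over the nodes, copy the build over SSH, and restart the workload on each one.

#!/bin/bash
# Sketch of the pre-Docker deployment: push the build to each node
# over SSH and restart the workload. Hostnames and paths are hypothetical.
NODES="node1 node2 node3"
for NODE in $NODES; do
  scp -r ./dist "$NODE":/opt/workload
  ssh "$NODE" "cd /opt/workload && nohup ./run.sh > run.log 2>&1 &"
done

Checking on a misbehaving node then meant ssh-ing into it and tailing run.log by hand.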

One more problem with the environment is compatibility with the tools we need. Let me explain with a real-life case. Almost all our nodes run Ubuntu 14.04. One day we decided to use Google OR-Tools and had a hard time installing it on Ubuntu 14.04, yet there are no such installation issues on Ubuntu 18.04. Inside a Docker container we can use any version we like, so we used Ubuntu 18.04 as the base image for that particular case.

Dockerization

Our workload became a Docker container. A Docker container is a self-sufficient package containing the OS, tools, and libraries required to run on any host, regardless of the host's OS version or installed packages. All dependencies are defined in a so-called Dockerfile, a script used by Docker to build an image that can be run later. We build and deploy Docker images in a single click with Jenkins.

# Set the base image to Ubuntu 18.04
FROM ubuntu:bionic
# Install required packages
RUN apt-get update && apt-get -y install python3 python3-pip
RUN pip3 install ortools
# Define the working dir
WORKDIR /dist
# Copy the actual app scripts into the image
COPY ./dist /dist
# Run the entry point script
CMD ./run.sh
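With the Dockerfile in place, building the image and smoke-testing it locally takes two commands (the image name curiosio/workload is just a placeholder):

docker build -t curiosio/workload .
docker run --rm curiosio/workload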

Next is configuring a cluster (a set of nodes) to automate deployment and management at scale. We were deciding between Kubernetes and Docker Swarm. Kubernetes is built by Google and is very popular, but it seems not to play well with Ubuntu 14.04, and we are not ready to upgrade all our servers to Ubuntu 18.04 yet. On the other hand, Swarm is Docker's native clustering mode and uses the built-in API. It is very light and stable, and it has no installation issues. A Docker Swarm cluster can be set up in minutes using just two steps, sketched below:

  • define the cluster by running a single command on the designated cluster leader node: docker swarm init
  • join the other nodes into the cluster by running a single command on each host: docker swarm join
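Concretely, it looks like this; the manager IP is a placeholder, and docker swarm init prints the exact join command, including the token, to run on the other nodes:

# On the designated leader node
docker swarm init --advertise-addr 10.0.0.1
# On every other node, using the token printed by init
docker swarm join --token <token> 10.0.0.1:2377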

On top of Docker Swarm we use Portainer. Portainer provides a web-based administration console for the cluster.
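One common way to run Portainer on a Swarm is as a service pinned to a manager node, with the Docker socket mounted so it can talk to the cluster. A sketch using the defaults of that era (image name and port may differ in your setup):

docker service create --name portainer \
  --publish 9000:9000 \
  --constraint 'node.role == manager' \
  --mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
  portainer/portainer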

Before vs. After

By dockerizing our environment, we reduced manual work roughly 20x in time spent, and we also made it roughly 10x less error prone. And that is only the happy path, when everything works fine and there are no conflicts between library versions or other environmental issues. Such issues can be very frustrating; against those bad cases the improvement is 100x or so, in both time and errors.
