
Vagrant Puppet VM Clusters

View Project Source: https://github.com/jessecascio/vms

The need for local VM clusters stems from the fact that most modern web applications are built on distributed networks, i.e., multiple servers. To develop modern applications effectively, it is essential to be able to quickly replicate server setups in a local environment. The goal of this project was to streamline the process of setting up distributed development environments and to share them with the development community. To do this, I used Vagrant, a virtual machine management tool for creating development environments. Vagrant is quick to learn, easy to use, and works with several VM providers. Read Vagrant Virtual Machine Cluster for more information on setting up a Vagrant environment.

After setting up various development environments, I realized that a lot of repeated work went into getting the servers ready for development: installing specific packages and services, updating software, managing users, etc. To cut that down, I began using Puppet as a provisioning tool. Puppet is a powerful piece of software that allows users to manage numerous servers from a central repository, and it is widely used for IT operations throughout the software industry. Although Puppet has many complexities that a developer does not need to understand, a general understanding of how it works is extremely beneficial when setting up development environments.

Project Layout

As stated above, the goal of this project was to set up a system for rapid deployment of multiple server development environments. Each environment needed the ability to easily add and configure more servers. Servers needed a centralized way of being configured to prevent repeated work for each new instance.

At the top level of the directory structure sit two directories: puppet and vagrant. The puppet directory contains all of the Puppet modules and manifests, organized by Linux distribution. The vagrant directory contains all of the Vagrant configuration files for setting up the development environments.

At the core of Puppet sit manifests, which are program files that define the state of a server. Which services should be installed and running, which commands to run on setup, how to manage users and the filesystem, and so on are all controlled by manifest files via resource declarations. Manifest files can contain programming logic such as variable declarations, conditionals, and functions. To aid with organization, Puppet offers classes to break similar functionality out into separate files, and modules to organize the classes. By using classes and modules, Puppet program files can be written in a concise, organized manner that allows for easy reuse across different server configurations.

With that in mind, navigating into puppet/centos reveals a modules directory with one module per piece of software: apache, mysql, php, etc. There is also a shell provisioning script, which the Vagrant VMs run so that Puppet is installed on new instances and they can use the Puppet modules. Inside each module directory are the specific classes pertaining to that module: apache::server, mysql::server, php::55, etc. This allows the Vagrant VMs to use only what they need, and it keeps all the necessary configuration for the development environments in a single location. If new functionality is needed, a single file can be created and reused across all virtual environments. If a change needs to be made to a software package, its configuration lives in one centralized place.

The vagrant directory contains the Vagrant configuration files that interact with the Puppet modules. Each environment is in its own directory and has its own Vagrantfile. The Vagrantfile is all that is needed for a Vagrant VM to be created. All configuration specific to the Vagrant cluster goes inside this file: port control, size restrictions, how many servers to include, which Puppet modules to use, etc. Notice that each environment also has a manifests directory with Puppet files. All these files do is include the Puppet modules needed by each server in the environment. Looking at the lamp environment directory, there are three manifest files: default.pp, mysql.pp, and web.pp. Each contains only include calls, which define the Puppet modules to be loaded. The default.pp manifest is run on all servers in the cluster and updates yum. The web.pp manifest is run on the web server and installs Apache and PHP, and the mysql.pp manifest is run on the database server and installs MySQL. Note that the mapping of servers to manifest files is defined in the Vagrantfile. If other environments are set up and need similar software, they can run the same Puppet modules instead of installing it manually, and the environment will be set up on VM load.
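The wiring described above could be sketched as a Vagrantfile for the lamp environment. This is a minimal sketch under stated assumptions, not the project's actual file: the box name, the value of PUPPET_MODULE_PATH, the bootstrap script path, the machine names, and the manifests_for helper are all hypothetical. Only the three manifest file names come from the description above.

```ruby
# Sketch of a lamp Vagrantfile. Values marked "assumed" are illustrative only.
PUPPET_MODULE_PATH = "../../puppet/centos/modules"      # assumed relative path
BOOTSTRAP_SCRIPT   = "../../puppet/centos/bootstrap.sh" # assumed script name

# Machine name => role-specific manifest (machine names are assumed).
SERVERS = {
  "web"   => "web.pp",   # installs Apache and PHP
  "mysql" => "mysql.pp", # installs MySQL
}.freeze

# Every server runs default.pp first, then its own role manifest.
def manifests_for(name)
  ["default.pp", SERVERS.fetch(name)]
end

# Only evaluated when Vagrant loads this file; plain `ruby` skips it.
if defined?(Vagrant)
  Vagrant.configure("2") do |config|
    config.vm.box = "centos/7" # assumed box

    # Install Puppet on each new instance before the Puppet provisioners run.
    config.vm.provision "shell", path: BOOTSTRAP_SCRIPT

    SERVERS.each_key do |name|
      config.vm.define name do |node|
        manifests_for(name).each do |manifest|
          node.vm.provision "puppet" do |puppet|
            puppet.module_path    = PUPPET_MODULE_PATH
            puppet.manifests_path = "manifests"
            puppet.manifest_file  = manifest
          end
        end
      end
    end
  end
end
```

Keeping the server-to-manifest mapping in one hash means that adding a server to the environment is a one-line change, which matches the goal of avoiding repeated work per instance.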

Hopefully I have demonstrated the usefulness of the Vagrant/Puppet setup. Vagrant environments can be created with as many servers as needed and provisioned with Puppet, allowing for easy package and software management across servers and environments. Although this is a fairly simple use of Puppet, it is extremely useful from a developer's standpoint, as all software management can be centralized and reused. It also makes it easy to share environments between developers. As I begin to work on more advanced distributed systems, having a quick way to build development environments will be a huge advantage. Distributed systems that use software such as Hadoop/Cassandra, MySQL replication/clusters, sharding, RabbitMQ, distributed caching, load balancing, etc. can now all be configured, built, and developed on locally.

Use Case

Here is an example of how this project can be used. The mysql-slaves development environment is a multi-server MySQL environment set up to test replication. In the Vagrantfile, the master is defined along with a slave:

config.vm.define "slave1" do |slave1|
  slave1.vm.network "private_network", ip: "10.2.2.4"

  slave1.vm.provision "puppet" do |puppet|
    puppet.module_path   = PUPPET_MODULE_PATH
    puppet.manifest_file = "mysql.pp"
  end
end

To initiate the environment:

vagrant up

Now, outside of some SSH tweaks for ease of use, there are two servers available, each with a specific IP address, that can be used to test different replication settings. Each server was configured using the same mysql::server Puppet class, so the software is installed and development is ready to begin.

After a simple master-slave setup has been configured and tested, more slaves can be added by repeating the above slave declaration in the Vagrantfile. Again, each server will be configured using the same mysql::server Puppet class, requiring no additional software configuration. Now we can test multiple slaves reading from a single master, test a master-slave pair with multiple slaves reading from the slave, or practice hot swapping a slave and master to simulate server failure. Any of the MySQL replication topologies can be easily configured and tested.
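Rather than copying the slave declaration by hand, the repetition could be expressed as a loop. The following is a sketch, not the project's code: the slave_nodes helper, SLAVE_COUNT, and the PUPPET_MODULE_PATH value are hypothetical, and only the slave1 address (10.2.2.4) comes from the Vagrantfile above. The addresses for further slaves are assumed to simply follow it.

```ruby
# Sketch: generate slave definitions instead of repeating the block by hand.
PUPPET_MODULE_PATH = "../../puppet/centos/modules" # assumed relative path
SLAVE_COUNT = 3 # assumed; set to however many slaves the test needs

# Returns one hash per slave with its machine name and private IP.
# Only slave1's address is from the original Vagrantfile; the others
# are assumed to continue sequentially from it.
def slave_nodes(count, first_host: 4)
  (1..count).map do |i|
    { name: "slave#{i}", ip: "10.2.2.#{first_host + i - 1}" }
  end
end

# Only evaluated when Vagrant loads this file; plain `ruby` skips it.
if defined?(Vagrant)
  Vagrant.configure("2") do |config|
    slave_nodes(SLAVE_COUNT).each do |slave|
      config.vm.define slave[:name] do |node|
        node.vm.network "private_network", ip: slave[:ip]

        # Same manifest for every slave, so no extra software setup is needed.
        node.vm.provision "puppet" do |puppet|
          puppet.module_path   = PUPPET_MODULE_PATH
          puppet.manifest_file = "mysql.pp"
        end
      end
    end
  end
end
```

With this shape, changing SLAVE_COUNT and running vagrant up again is enough to grow the cluster, provided the generated addresses do not collide with the master's.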

Once a replication setup has been chosen, a developer next needs to determine the best way to utilize it. Should reads and writes be distributed by a load balancing server such as HAProxy, or within the application itself using software similar to PHP's MySQL replication library? With the Vagrant/Puppet setup, adding a web server or load balancing server is as simple as defining the VM in the Vagrantfile and including the necessary Puppet classes. If a new piece of software is needed on a server, a Puppet module or class can be created and reused across servers and projects instead of configuring the server manually.

Applying the development process outlined above to more advanced server architectures makes developing for a distributed, cloud environment much more efficient. Complex server architectures can be easily set up and shared in local environments, and development can be done without needing to work on a remote server.