Building, installing, and configuring your own Ruby RPM for CentOS

Published: July 23, 2016
If you're running your Ruby apps on your own server, you obviously need Ruby installed. The problem is the version of Ruby provided by the OS is commonly not what you want.

While Ruby managers like rbenv and rvm are great, I personally believe they should only be used for development and not in a production environment. However, the packages that Linux distros provide are usually outdated. Luckily, there's another option: roll your own Ruby package. This allows you to use just about any version of Ruby you want AND have it handled by the distro's package manager. CentOS is my preferred distro for production, so I'll be going over the process of creating an RPM for Ruby 2.3 along with installing and configuring it in a production environment.

These instructions will work for both CentOS 6 and 7 and, in theory, should work for Red Hat Enterprise Linux.

Preparation

I prefer to create my package on a separate system from production. To accomplish this, I have a VirtualBox VM set up with the same distro version production uses (this is a requirement). In my case, I have a VM with CentOS 7 with all package updates applied. You can have either a minimal install (no GUI) or one with a desktop environment such as GNOME. All commands (both in the VM and on production) are run by a non-root user with sudo access.

The majority of the packages needed to create an RPM come from the "Development Tools" group. The group includes programs such as gcc, rpm-build, and redhat-rpm-config, among others. Not all are needed, but this is a good way to cover all your bases for basic package building in one go. All of them can be installed with the following command:

$ sudo yum groupinstall 'Development Tools'

Setting Up the Build Environment

Run these two commands to set up the needed directories and default configuration for RPM making:

$ mkdir -p ~/rpmbuild/{BUILD,BUILDROOT,RPMS,SOURCES,SPECS,SRPMS}
$ echo '%_topdir %(echo $HOME)/rpmbuild' > ~/.rpmmacros

The first command creates a series of directories in ~/rpmbuild/. The three directories we are most interested in are RPMS, SOURCES, and SPECS:

SOURCES - This directory will hold the compressed Ruby source code.
SPECS - In here, we'll put a file that describes how the package is to be built. A spec file has metadata about the package along with instructions on how to compile Ruby.
RPMS - When the package is built, the resulting .rpm file is placed in here.

The second command creates a configuration "dot file" for rpmbuild to tell it that, by default, all package-related files live in ~/rpmbuild/.

Get Ruby Source Code and .spec File

Now we need to get the actual source code from ruby-lang.org and put it in the SOURCES directory. This can be done with one command (replace "2.3.1" with the version of Ruby you want to build):

$ cd ~/rpmbuild/SOURCES && wget ftp://ftp.ruby-lang.org/pub/ruby/ruby-2.3.1.tar.gz

Next, we need a spec file. You're kind of on your own at this point as there is no official spec file people use. The one I use was created by feedforce on GitHub. Their feedforce/ruby-rpm repo has a spec file for Ruby versions (by switching tags) all the way back to 2.1.0. I'm going to use their latest spec file for Ruby 2.3.1.

Let's download it to our SOURCES directory. After downloading, you should view it in your favorite text editor and verify it looks on the up and up. You should never blindly download and use a spec file unless you understand what it's doing.

$ cd ~/rpmbuild/SPECS && wget https://raw.githubusercontent.com/feedforce/ruby-rpm/2.3.1/ruby.spec

I'm not going to go over all the details of the file, but I'll point out a couple key areas. If you'd like to follow along, you can view the ruby.spec file you just downloaded.

%define rubyver         2.3.1

# This is the name of the package.
Name:           ruby

...

# These are packages that are required to be installed when the package installs. 
# Yum handles this for us.
Requires:       readline ncurses gdbm glibc openssl libyaml libffi zlib

# These packages are needed to create the RPM (In our case, to compile Ruby). 
# Building will fail if they are not all installed
BuildRequires:  readline-devel ncurses-devel gdbm-devel glibc-devel gcc openssl-devel make libyaml-devel libffi-devel zlib-devel

# This is a reference for the package to tell it where the source code came from. 
# I use this to know where to grab the source code to put in the SOURCES directory
Source0:        ftp://ftp.ruby-lang.org/pub/ruby/ruby-%{rubyver}.tar.gz

...

# This tells yum what packages and functions our RPM will provide.
Provides: ruby(abi) = 2.3
Provides: ruby-irb
Provides: ruby-rdoc
Provides: ruby-libs
Provides: ruby-devel
Provides: rubygems

...

# These are the commands, switches, and settings used to compile Ruby.
# For example, our Ruby won't have any of the tk hooks baked in (--without-tk).
%configure \
  --enable-shared \
  --disable-rpath \
  --without-X11 \
  --without-tk \
  --includedir=%{_includedir}/ruby \
  --libdir=%{_libdir}

make %{?_smp_mflags}

%install
make install DESTDIR=$RPM_BUILD_ROOT

...

# Rest of the file

Build the Package

Ok, let's try building. We use rpmbuild to create the package.

$ rpmbuild -ba ~/rpmbuild/SPECS/ruby.spec

Hopefully, if you've been following along, this command should fail:

error: Failed build dependencies:
    readline-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    ncurses-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    gdbm-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    openssl-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    libyaml-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    libffi-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    zlib-devel is needed by ruby-2.3.1-1.el7.centos.x86_64

This is rpmbuild letting us know that there are packages we specified as needed to compile Ruby that aren't installed. So, let's install them and try building again.

$ sudo yum install -y readline-devel ncurses-devel gdbm-devel openssl-devel libyaml-devel libffi-devel zlib-devel
$ rpmbuild -ba ~/rpmbuild/SPECS/ruby.spec

If all goes well, the package should build without a hitch. You'll see a bunch of blonds, brunettes, and redheads scroll through your terminal while Ruby compiles. This is normal and the process should complete eventually.

When it's all done, you should have a complete rpm file in ~/rpmbuild/RPMS/ (under an architecture subdirectory such as x86_64/). Depending on your version of CentOS, the name will vary slightly...

CentOS 6: ruby-2.3.1-1.el6.x86_64.rpm

CentOS 7: ruby-2.3.1-1.el7.centos.x86_64.rpm
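
Before pushing the package anywhere, you can sanity-check its metadata and file list with rpm's query flags (the paths below assume the CentOS 7 build):

$ rpm -qpi ~/rpmbuild/RPMS/x86_64/ruby-2.3.1-1.el7.centos.x86_64.rpm
$ rpm -qpl ~/rpmbuild/RPMS/x86_64/ruby-2.3.1-1.el7.centos.x86_64.rpm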

Install on Production

Now that you have an rpm file, we can install it on the production server. I use scp to push the file up from my build VM to the server, but any standard file transfer process will work.

$ scp ~/rpmbuild/RPMS/x86_64/ruby-2.3.1-1.el7.centos.x86_64.rpm myserver:~/ruby-2.3.1-1.el7.centos.x86_64.rpm

SSH into production. To have yum install the package, we use the --nogpgcheck flag since we did not sign our package.

$ sudo yum --nogpgcheck install ~/ruby-2.3.1-1.el7.centos.x86_64.rpm

If installation is successful, verify both the ruby and gem binaries are available and the versions are what you expect them to be.

$ ruby -v
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
$ gem -v
2.5.1

Hooray! We got Ruby installed and ready to go! We're done... right?

Allow Gems to Install

Our package provides irb, ruby, and gem, but we don't have bundle available. This is because, well, bundler is a gem! So we should probably install it...

$ gem install bundler
Fetching: bundler-1.12.5.gem (100%)
ERROR:  While executing gem ... (Gem::FilePermissionError)
    You don't have write permissions for the /usr/lib64/ruby/gems/2.3.0 directory.

Huh... we can't install bundler... in fact, we can't install any gems! This is due to how Ruby is installed by default. The default path for gems is /usr/lib64/ruby/gems/2.3.0/, which is owned by root. We could use sudo to allow gems to install... but we really shouldn't need root access to install them.

Luckily, there's another way. We're going to tell Ruby to install gems in the user's home directory. Each user on the system will be able to install gems for themselves because they will be installed to directories where they have write permissions. We can do this globally by creating a file at /etc/profile.d/ruby.sh. All files in /etc/profile.d/ are sourced at login to set up each user's environment. Our ruby.sh file will contain the following two lines:

export PATH=$HOME/.gem/ruby/2.3.0/bin:$PATH
export GEM_HOME=$HOME/.gem/ruby/2.3.0

These two lines set ~/.gem/ruby/2.3.0/ as the directory where all gems for the user will be installed and add ~/.gem/ruby/2.3.0/bin/ to PATH, which is where gem binaries (such as bundle, rake, etc.) will go. Once the file is in place, log out and log back into production to pick up the new environment. With all that done, gems should install with no problem for all users.
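
If you want a quick way to create the file, something like this works (a sketch using tee; any editor run with sudo is fine too — the quoted 'EOF' keeps $HOME from expanding when you run it):

$ sudo tee /etc/profile.d/ruby.sh > /dev/null <<'EOF'
export PATH=$HOME/.gem/ruby/2.3.0/bin:$PATH
export GEM_HOME=$HOME/.gem/ruby/2.3.0
EOF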

$ gem install bundler
Successfully installed bundler-1.12.5
Parsing documentation for bundler-1.12.5
Installing ri documentation for bundler-1.12.5
Done installing documentation for bundler after 3 seconds
1 gem installed

Success! A Ruby version we want is installed as the system Ruby and gems can be installed. Congratulations, you've successfully built your first RPM and installed it in a production environment. High five!


Seamlessly migrating from one git service to another for your team and capistrano

Published: July 15, 2016
I decided to move my projects from one hosted git service to a different one and needed to make the change near seamless for contributors and production.

There are quite a decent number of services out there to host code repositories, with Bitbucket, GitHub, and GitLab leading the pack. Sometimes you find yourself needing to switch services for whatever reason. Obviously this decision shouldn't be taken lightly, and a decent amount of thought should be put into making the choice to move or not.

For me, personally, I chose to move my repos (all private) from a self-hosted GitLab instance to GitLab.com proper. I have a couple contributors to the repos, so they need to be able to seamlessly switch to the new repo location. Most of these projects use capistrano to deploy to production, so that process will also need to be updated to point to the new repo. The process I took can be applied to pretty much any hosted service and should scale to larger teams as long as proper communication happens.

Step 1: Push up and pull down all the things

To ensure a seamless transition, all WIP branches, changes, tags, etc. from all contributors that are currently tracked by origin should be pushed up before starting, and at least one contributor should pull down all changes locally (a simple git pull should handle this) to serve as a canonical backup in case something goes wrong.
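
A sketch of what that might look like (each contributor pushes from their working copy; the designated backup contributor then pulls):

git push origin --all
git push origin --tags
git pull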

Step 2: Setup access and accounts

This is mostly administrative. Projects and accounts should be created on the new service for all contributors with proper push/pull access given where applicable.

Step 3: Point origin to the new location

Whichever service you switch to should provide you with a clone URL for the new repo. It's either an http(s) or SSH path ending in .git. This is what you'd normally use when cloning a repo for the first time (git clone git@somesite.com:user/project.git). Instead, we'll use the URL to point all local repos to the new service.

Each contributor should run this command at the root of their local copy of their project (replacing the url with the proper one to the new location of the repo):

git remote set-url origin git@gitlab.com:t27duck/my-project.git

This command tells git that origin now points to a new location, so all pushes and pulls will use the new endpoint from now on.
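
You can confirm the change took effect by listing the remotes and their URLs:

git remote -v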

Step 4: Migrate projects to new service

This step can be done in one of two ways. While I cannot speak for Bitbucket and other services (due to lack of experience with them), GitLab and GitHub both provide migration tools for moving repos from one service to another. If possible, use these tools as they will commonly also migrate issues, wikis, Pull Requests/Merge Requests, etc. along with the code. Note that in the case of private repos, most migration tools cannot access them across services (hence the "private" part of "private repos"), so a common requirement for invoking a repo migration is to make the source repo temporarily public.

For those private projects you do not wish to make public (even for 10 minutes)... or if you rage quit your previous provider before or after step 1 and the repo no longer exists, there's another option. This alternative method only moves the code. PRs, issues, etc. won't be migrated as they are not part of the repository.

After completing step 3, one contributor should issue the following two commands in a terminal:

git push --all
git push --tags

The first command pushes all branches, refs, changes, bits, bytes, whathaveyou to the new origin. The second command pushes up any tags since, despite what you'd assume it would do, the --all flag only pushes up branches and code. If you have a decently sized repo, it may take a few moments before the commands complete. Once that is done and you verify master, branches, and tags are all present and accounted for, contributors should be good to go for pushing and pulling as normal.
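
One way to verify everything made it over is to list the refs the new origin knows about:

git ls-remote --heads origin
git ls-remote --tags origin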

Step 5: Point deployment system to the new origin

The final step is to have your deployment system also pull from the new origin. I'm going to talk specifically about configuring capistrano as my projects are all Ruby web apps using capistrano for deployment.

First, modify config/deploy.rb to point to the new repo location. For capistrano 2.x, update the repository setting. In capistrano 3.x, it is called repo_url. This change tells capistrano where the repo now lives when doing a full clone. Commit and push up the change.
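
As a sketch, reusing the example URL from step 3:

# config/deploy.rb

# capistrano 2.x
set :repository, "git@gitlab.com:t27duck/my-project.git"

# capistrano 3.x
set :repo_url, "git@gitlab.com:t27duck/my-project.git"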

At this point, you will still not be able to deploy as the cached version of the repo on production is still pointing to the old location. Attempting to deploy now will most likely cause "Cannot parse object 'abcdefg'" errors because the cached repo has no reference to the new refs coming from the new origin.

There are two options to fix this. You can either manually change the git configuration file to point to the new origin or flat out delete the cached repo. The latter will cause capistrano to do a full clone on the next deploy which will cause the new cached repo to automatically use the new endpoint.

Regardless of which option you take, the files live in different locations depending on the version of capistrano.

For 2.x:

The cached repo lives in {deploy_root}/shared/cached-copy

There's a .git/config file there that can be modified to point "origin" to the new location. Alternatively, you could delete the entire cached-copy directory.

For 3.x:

The {deploy_root}/repo directory is what you want to delete. Alternatively, since it's a bare repo, there's a config file directly inside it that can be edited.
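
A sketch of both options, again reusing the example URL from step 3 (pick one per version):

# capistrano 2.x: repoint the cached copy...
cd {deploy_root}/shared/cached-copy
git remote set-url origin git@gitlab.com:t27duck/my-project.git
# ...or delete it and let the next deploy re-clone
rm -rf {deploy_root}/shared/cached-copy

# capistrano 3.x: delete the cached bare repo
rm -rf {deploy_root}/repo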

Once that is complete, everyone should be good to go. It's business as usual from that point forward.

Summary

It's relatively straightforward to migrate from one git provider to another. The trick is making sure all contributors and services properly point their repos to the new origin. As long as the migration process is properly communicated and executed, you shouldn't have any issues or lost data.


Migrating from unicorn to puma with no downtime

Published: June 26, 2016
I decided to migrate a Rails 4.2 side project from unicorn to puma. The app has relatively high traffic throughout the day, so I wanted to avoid any visible downtime.

How production is set up

  • Ruby 2.2.4
  • unicorn (20 workers)
  • nginx proxying requests through the unicorn socket
  • systemd for starting and monitoring unicorn master process
  • capistrano for code deployment
  • App runs under an unprivileged user on a VPS (meaning, this blog post is not for Heroku users)

The prep work

The app is running Rails 4.2 plus a handful of various gems. Because puma is a multithreaded web server, the app and supporting gems need to be as thread-safe as possible.

My first step was to perform a brief audit of the app code to eliminate possible code paths that are not thread-safe. Using this excellent article by Jarkko Laine as a guide, I found that the majority of the already-written app code should be thread-safe, and I made minor edits to any rough spots I came across.

Confident the app code was thread-safe, I turned my attention to the 3rd party gems. I proceeded to audit each gem (and its dependencies) to see if there were any open issues regarding thread-safety. I "sledgehammered" some of the simpler gems I used, such as acts_as_list, with quick Google searches for "[gem name] thread safe" to see if anything came up. For other, larger, more mission-critical gems like devise, I dove into their respective GitHub issues to see if there were any open reports of thread-safety problems. Only the paperclip gem came up as a red flag. The cocaine gem, which is a dependency of paperclip, had a history of not being thread-safe until version 0.5.5 (specifically this commit). For me, paperclip was the only gem I had to make sure was up to date. Your mileage may vary as everyone's gem list is different.

After updating the app code (where applicable) and gems, I committed and deployed the changes to production.

Swap out the gems and setup configuration

With the code and gems audited, we're ready to setup puma. First, I swapped out the unicorn gem for puma in the Gemfile.

-gem 'unicorn', '~> 5.1.0'
+gem 'puma', '~> 3.4.0'

I found the default puma.rb configuration file provided with Rails 5 works well enough for local development, so I included it in my repo.

For production, I use a custom file:

bind "unix:///path/to/my/app-puma.sock"
pidfile "/path/to/my/app-puma.pid"

threads 2, 2
workers 10

environment "production"

prune_bundler

directory '/path/to/my/app/current'

A few points of interest with my production configuration:

  • I'm having the puma process create a pid and a sock file. This allows systemd to monitor the process via the pid and for nginx to connect to the upstream application through the Unix socket.
  • I'm enabling workers (puma's clustering mode) with each worker process having a max of two threads. This effectively gives me the same concurrency as I had with unicorn for half the memory. As I get more confident in my thread-safety, I can begin to lower the number of workers while increasing the number of threads per worker.
  • I do not use the preload_app! option. This allows me to issue rolling restarts of the application code during deploys without requiring a full restart of puma. The trade-off is that I lose the ability to take advantage of copy-on-write.
  • The prune_bundler option runs the main puma process detached from bundler, which allows gems to be updated during deploys.
  • Since I'm using capistrano, which uses symlinks to expose the updated app code, the directory option is needed to make the puma executable properly follow the new symlink during deploys.

The production configuration is written out to /path/to/my/app/shared/config/ and capistrano's deploy.rb is configured to symlink the production puma configuration into the right spot for deploys.
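
For capistrano 3.x, one way to wire up that symlink is the linked_files setting (a sketch; append requires capistrano 3.5+, while older 3.x versions use the set/fetch/push form):

# config/deploy.rb
append :linked_files, 'config/puma.rb'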

After running bundle install to install puma and remove unicorn, I committed and pushed up the changes. Note that I did not deploy.

In preparation for the change-over, I temporarily commented out the hook in config/deploy.rb that would normally tell unicorn to reload.

namespace :deploy do
  after :publishing, :restart
  desc 'Restart application'
  task :restart do
    on roles(:app), in: :sequence, wait: 5 do
      # execute "kill -s USR2 `cat /path/to/my/app-unicorn.pid`"
    end
  end
end

I then set up a new systemd service file that will manage starting and monitoring puma. I used the example systemd configuration provided by puma as a base.

[Unit]
Description=My App Puma Server
Requires=redis.service postgresql-9.4.service
Wants=redis.service postgresql-9.4.service memcached.service
After=redis.service postgresql-9.4.service

[Service]
Type=simple
User=appuser
PIDFile=/path/to/my/app-puma.pid
WorkingDirectory=/path/to/my/app/current
Environment=RAILS_ENV=production
ExecStart=/path/to/my/app/current/bin/bundle exec puma -e production -C ./config/puma.rb config.ru
ExecReload=/bin/kill -s USR1 $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
Restart=always

[Install]
WantedBy=multi-user.target

In my service file, I specified a PIDFile so I can send signals to the pid for reloading and stopping the puma process from systemd. Note that for puma, we use USR1 for rolling restarts instead of USR2 which is what unicorn uses.

In my nginx configuration, I pointed to where the new puma socket will be in the upstream section, but I did not reload the nginx configuration at this point.
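
A sketch of what that updated upstream block might look like (the upstream name and fail_timeout are illustrative; the socket path matches the puma config above):

upstream app {
  # nginx now hands requests to puma's socket instead of unicorn's
  server unix:/path/to/my/app-puma.sock fail_timeout=0;
}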

The plan is to run both unicorn and puma side-by-side. While I did not incur downtime, I temporarily sacrificed some concurrency by cutting the number of unicorn workers in half: I issued the TTOU signal to the master unicorn process ten times to decrement the number of children. The stage is now set.
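
For reference, that can be done with a quick shell loop (the pid file path matches the one in the old restart task):

$ for i in $(seq 1 10); do kill -s TTOU `cat /path/to/my/app-unicorn.pid`; done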

Executing the switch over

Show time! With the restart task in the capistrano deploy file grounded out, I issued a production deploy. This will push the new Gemfile to production, install the new puma gem, and remove unicorn. However, since the restart task is doing nothing, the old unicorn process is still running in memory.

Now that the new code, configurations, and gems are in place, I started up the puma cluster via systemd:

$ sudo systemctl start myapp-puma.service
# Wait a bit... let's check to make sure it's up...
$ sudo systemctl status myapp-puma.service
* myapp-puma.service - My App Puma Server
   Loaded: loaded (/etc/systemd/system/myapp-puma.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2016-06-24 02:14:19 UTC; 2 days ago
 Main PID: 4025 (ruby)
   CGroup: /system.slice/myapp-puma.service
           |-- 308 puma: cluster worker 0: 4025 [20160624015128]
           |-- 467 puma: cluster worker 1: 4025 [20160624015128]
            ....
           |-- 4025 puma 3.4.0 (unix:///path/to/my/app-puma.sock,tcp://0.0.0.0:3000) [20160624015128]

Awesome! I now have puma and unicorn running side-by-side. Now's the time for the actual switchover.

Remember how I updated my nginx config earlier to point to the new socket? I never reloaded the nginx daemon, so the in-memory process is still pointing to the still-running unicorn socket. Switching over is now just a simple reload command to nginx:

$ sudo systemctl reload nginx

Boom! The app still comes up in the browser. We're now running on puma without any user noticing a thing.

Cleanup

Time to say goodbye to unicorn and cement the transfer to puma.

I turned off the unicorn process and disabled its service to prevent it from trying to start up the next time the server reboots. Then I enabled the puma service so it starts up automatically from now on.

$ sudo systemctl stop myapp-unicorn.service
$ sudo systemctl disable myapp-unicorn.service
$ sudo systemctl enable myapp-puma.service

Finally, I updated and committed the change to the config/deploy.rb capistrano file to issue a rolling restart of the puma cluster.

namespace :deploy do
  after :publishing, :restart
  desc 'Restart application'
  task :restart do
    on roles(:app), in: :sequence, wait: 5 do
       execute "kill -s USR1 `cat /path/to/my/app-puma.pid`"
    end
  end
end

MISSION ACCOMPLISHED

Recap

I did a lot of pre-planning and setup before I actually got puma running on production. Good planning and organization are key to a successful, seamless switchover. I was fortunate enough to come up with this plan, and it worked near flawlessly. Your experience will probably be different than mine, but hopefully what I showed above can act as a starting point for your own switchover.

Tags: ruby, rails, server
