Migrating from unicorn to puma with no downtime
Published: June 26, 2016
How production is set up
- Ruby 2.2.4
- unicorn (20 workers)
- nginx proxying requests through the unicorn socket
- systemd for starting and monitoring the unicorn master process
- capistrano for code deployment
- App runs under an unprivileged user on a VPS (meaning, this blog post is not for Heroku users)
The prep work
The app is running Rails 4.2 plus a handful of various gems. Because puma is a multithreaded web server, the app and supporting gems need to be as thread-safe as possible.
My first step was to perform a brief audit of the app code to eliminate possible code paths that are not thread-safe. Using this excellent article by Jarkko Laine as a guide, I found that the majority of the already-written app code should be thread-safe, and I made minor edits to any rough spots I came across.
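To make the audit concrete, here is a minimal sketch of the kind of pattern I was looking for: a read-modify-write on shared class-level state. The class names (UnsafeCounter, SafeCounter) are hypothetical and do not come from my app; they only illustrate why the same code that is harmless under single-threaded unicorn workers can misbehave under puma threads.

```ruby
# Hypothetical example, not taken from the app: class-level state shared by
# every thread inside one puma worker.
class UnsafeCounter
  @count = 0
  class << self
    attr_reader :count

    # Read-modify-write with no lock. Two threads can read the same value,
    # and one of the increments is then lost.
    def increment
      current = @count
      Thread.pass # widen the race window for demonstration purposes
      @count = current + 1
    end
  end
end

class SafeCounter
  @count = 0
  @lock = Mutex.new
  class << self
    attr_reader :count

    # The Mutex makes the read-modify-write atomic across threads.
    def increment
      @lock.synchronize { @count += 1 }
    end
  end
end

[UnsafeCounter, SafeCounter].each do |counter|
  threads = 8.times.map { Thread.new { 1_000.times { counter.increment } } }
  threads.each(&:join)
  # SafeCounter always reaches 8000; UnsafeCounter may fall short.
  puts "#{counter.name}: #{counter.count}"
end
```

In a Rails app the same shape tends to hide in class variables, class-level instance variables, and memoized globals, so those are reasonable places to start looking.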
Confident the app code was thread-safe, I turned my attention to the third-party gems. I proceeded to audit each gem (and its dependencies) to see if there were any open issues regarding thread-safety. I "sledgehammered" some of the more "simple" gems I used, such as acts_as_list, with simple Google searches of "[gem name] thread safe" to see if anything came up. For other, larger and more mission-critical gems like devise, I dove into their respective GitHub issues to see if there were any open reports of thread-safety problems. Only the paperclip gem came up as a red flag with thread-safety issues. The cocaine gem, which is a dependency of paperclip, had a history of not being thread-safe until version 0.5.5 (specifically this commit). For me, paperclip was the only gem I had to make sure was up to date. Your mileage may vary, as everyone's gem list is different.
After updating the app code (where applicable) and gems, I committed and deployed the changes to production.
Swap out the gems and set up configuration
With the code and gems audited, we're ready to set up puma. First, I swapped out the unicorn gem for puma in the Gemfile.
    -gem 'unicorn', '~> 5.1.0'
    +gem 'puma', '~> 3.4.0'
I found the default puma.rb configuration file provided with Rails 5 works well enough for local development, so I included it in my repo.
For production, I use a custom file:
    bind "unix:///path/to/my/app-puma.sock"
    pidfile "/path/to/my/app-puma.pid"
    threads 2, 2
    workers 10
    environment "production"
    prune_bundler
    directory '/path/to/my/app/current'
A few points of interest with my production configuration:
- I'm having the puma process create a pid and a sock file. This allows systemd to monitor the process via the pid and for nginx to connect to the upstream application through the Unix socket.
- I'm enabling workers (puma's clustering mode) with each worker process having a maximum of two threads. 10 workers with 2 threads each give 20 request slots, matching the 20 single-threaded unicorn workers I had before, for roughly half the memory. As I get more confident in my thread-safety, I can begin to lower the number of workers while increasing the number of threads per worker.
- I do not use the preload_app! option. This allows me to issue rolling restarts of the application code during deploys without requiring a full restart of puma. The trade-off is that I lose the ability to take advantage of copy-on-write.
- The prune_bundler option allows the main puma process to run detached from bundler. This allows for gem updates during deploys.
- Since I'm using capistrano, which uses symlinks to expose the updated app code, the directory option is needed to make the puma executable properly follow the new symlink during deploys.
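The concurrency claim in the list above is simple arithmetic, sketched below. The helper name request_slots is hypothetical, and the 5 x 4 layout is only an illustration of the "fewer workers, more threads" direction, not a setting I have run. The memory saving assumes each puma worker has a footprint similar to a unicorn worker, which I have not measured precisely.

```ruby
# Request slots = worker processes x threads per worker.
# `request_slots` is a hypothetical helper for illustration only.
def request_slots(workers:, threads_per_worker:)
  workers * threads_per_worker
end

unicorn = request_slots(workers: 20, threads_per_worker: 1) # unicorn workers are single-threaded
puma    = request_slots(workers: 10, threads_per_worker: 2) # workers 10 / threads 2, 2
later   = request_slots(workers: 5,  threads_per_worker: 4) # possible future layout (assumption)

puts "unicorn: #{unicorn}, puma now: #{puma}, puma later: #{later}"
# prints "unicorn: 20, puma now: 20, puma later: 20"
```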
The production configuration is written out to /path/to/my/app/shared/config/ and deploy.rb for capistrano is configured to symlink the production puma configuration to the right spot for deploys.
After running bundle install to install puma and remove unicorn, I committed and pushed up the changes. Note that I did not deploy.
In preparation for the change-over, I temporarily commented out the hook in config/deploy.rb that would normally tell unicorn to reload.
    namespace :deploy do
      after :publishing, :restart

      desc 'Restart application'
      task :restart do
        on roles(:app), in: :sequence, wait: 5 do
          # Temporarily disabled for the change-over:
          # execute "kill -s USR2 `cat /path/to/my/app-unicorn.pid`"
        end
      end
    end
I then set up a new systemd service file that will manage starting and monitoring puma. I used the example systemd configuration provided by puma as a base.
    [Unit]
    Description=My App Puma Server
    Requires=redis.service postgresql-9.4.service
    Wants=redis.service postgresql-9.4.service memcached.service
    After=redis.service postgresql-9.4.service

    [Service]
    Type=simple
    User=appuser
    PIDFile=/path/to/my/app-puma.pid
    WorkingDirectory=/path/to/my/app/current
    Environment=RAILS_ENV=production
    ExecStart=/path/to/my/app/current/bin/bundle exec puma -e production -C ./config/puma.rb config.ru
    ExecReload=/bin/kill -s USR1 $MAINPID
    ExecStop=/bin/kill -s QUIT $MAINPID
    Restart=always

    [Install]
    WantedBy=multi-user.target
In my service file, I specified a PIDFile so I can send signals to the pid for reloading and stopping the puma process from systemd. Note that for puma, we use USR1 for rolling restarts instead of USR2, which is what unicorn uses.
In my nginx configuration, I pointed the upstream section to where the new puma socket will be, but I did not reload the nginx configuration at this point.
The plan is to run both unicorn and puma side-by-side. While I did not incur downtime, I temporarily sacrificed some concurrency by cutting the number of unicorn workers in half. I issued the TTOU signal to the master unicorn process ten times, each one decrementing the number of workers by one. The stage is now set.
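I sent the signals by hand, but the step could be scripted along these lines. This is a sketch, not what I actually ran: the method name shed_workers, the injectable sender, and the pause between signals are all my own assumptions. The pause is there only to give the master time to handle each TTOU before the next one arrives.

```ruby
# Hypothetical helper: send TTOU to the unicorn master `count` times.
# `sender` defaults to Process.kill and can be swapped out for a dry run.
def shed_workers(master_pid, count, pause: 1, sender: Process.method(:kill))
  count.times do
    sender.call('TTOU', master_pid) # each TTOU asks the master to drop one worker
    sleep pause                     # assumed breathing room between signals
  end
end

# Dry run: record the signals instead of sending them.
sent = []
shed_workers(4025, 10, pause: 0, sender: ->(sig, pid) { sent << [sig, pid] })
puts "would send #{sent.size} TTOU signals"

# Real usage would look something like this (pid file path as used in this post):
# master_pid = Integer(File.read('/path/to/my/app-unicorn.pid').strip)
# shed_workers(master_pid, 10)
```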
Executing the switch over
Show time! With the restart task in the capistrano deploy file stubbed out, I issued a production deploy. This pushes the new Gemfile to production, installs the new puma gem, and removes unicorn. However, since the restart task does nothing, the old unicorn process is still running in memory.
Now that the new code, configurations, and gems are in place, I started up the puma cluster via systemd:
    $ sudo systemctl start myapp-puma.service
    # Wait a bit... let's check to make sure it's up...
    $ sudo systemctl status myapp-puma.service
    * myapp-puma.service - My App Puma Server
       Loaded: loaded (/etc/systemd/system/myapp-puma.service; enabled; vendor preset: disabled)
       Active: active (running) since Fri 2016-06-24 02:14:19 UTC; 2 days ago
     Main PID: 4025 (ruby)
       CGroup: /system.slice/myapp-puma.service
               |-- 308 puma: cluster worker 0: 4025 [20160624015128]
               |-- 467 puma: cluster worker 1: 4025 [20160624015128]
               ....
               |-- 4025 puma 3.4.0 (unix:///path/to/my/app-puma.sock,tcp://0.0.0.0:3000) [20160624015128]
Awesome! I now have puma and unicorn running side-by-side. Now's the time for the actual switchover.
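systemctl status only tells me the process is alive. Before pointing nginx at it, it may be worth confirming that puma actually answers on its Unix socket. The snippet below is a sketch of such a check and not something from my original runbook; the method name status_line and its defaults are hypothetical.

```ruby
require 'socket'

# Hypothetical helper: send a bare HTTP/1.0 request over a Unix socket and
# return the first line of the response (the HTTP status line), or nil if
# the server closes the connection without answering.
def status_line(socket_path, path: '/', host: 'localhost')
  UNIXSocket.open(socket_path) do |sock|
    sock.write("GET #{path} HTTP/1.0\r\nHost: #{host}\r\n\r\n")
    sock.gets&.strip
  end
end

# Usage, with the socket path from the puma configuration in this post:
# puts status_line('/path/to/my/app-puma.sock')
```

Any HTTP status line coming back, even a redirect, shows that the puma cluster is accepting connections on the socket nginx is about to use.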
Remember that I updated my nginx config earlier to point to the new socket? I never reloaded the nginx daemon, so the in-memory process is still pointing to the still-running unicorn socket. Switching is now just a simple reload command to nginx:
$ sudo systemctl reload nginx
Boom! The app still comes up in the browser. We're now running on puma without any user noticing a thing.
Cleanup
Time to say goodbye to unicorn and cement the transfer to puma.
I turned off the unicorn process and disabled the service to prevent it from trying to start up the next time the server reboots. Then I enabled the puma service so it starts up automatically on boot.
    $ sudo systemctl stop myapp-unicorn.service
    $ sudo systemctl disable myapp-unicorn.service
    $ sudo systemctl enable myapp-puma.service
Finally, I updated and committed the change to the config/deploy.rb capistrano file to issue a rolling restart of the puma cluster.
    namespace :deploy do
      after :publishing, :restart

      desc 'Restart application'
      task :restart do
        on roles(:app), in: :sequence, wait: 5 do
          # USR1 asks puma for a rolling (phased) restart
          execute "kill -s USR1 `cat /path/to/my/app-puma.pid`"
        end
      end
    end
MISSION ACCOMPLISHED
Recap
I did a lot of pre-planning and setup before I actually got puma running on production. Good planning and organization are key for a successful, seamless switchover. I was fortunate enough to come up with this plan, and it worked nearly flawlessly. Your experience will probably be different from mine, but hopefully what I showed you above can act as a starting point for how you do your switchover.