Deploying a classification model (with Flask, Apache, virtual environments and WSGI)

In which we deploy our classifier to a cloud server.

I thought training the classifier was the difficult part but it turns out that deployment was the real test… Here I record the steps I took and issues I encountered to get my classifier ‘whats-that-tree’ working in the cloud as a reminder to self, and to help anyone who might have similar issues!

Setup

Cloud server: DigitalOcean
App framework: Flask
OS: Ubuntu 16.04
Server memory: 2GB
Web server: Apache

One of the main hurdles was getting fastai installed inside a virtual environment and having my web server use that environment to serve the Flask application. I’ve used mod_wsgi with Flask before to set up another project but it was a few years ago so I needed some refreshing on the details. The Flask documentation is a good starting point and so is this guide. ~~However, neither touch on how to have mod_wsgi serve the app with a specific virtual environment.~~ (See end of post / Flask documentation)

Step 1

Install python3 (I installed version 3.6), virtualenv, apache2 if you haven’t already:

sudo apt-get update
sudo apt-get install python3.6
pip install virtualenv
sudo apt-get install apache2
sudo apt-get install apache2-dev

The last command is for apache2 dev tools. I can’t remember if that’s necessary now but I did install it at some point so throwing it in for good measure.

Step 2

Create a virtual environment for the Flask app:

virtualenv -p python3.6 venv

Activate the virtual environment:

. venv/bin/activate

You need to install a mod_wsgi version for python3:

pip3 install mod_wsgi

Now to install fastai - since my server only has CPU I opted to first install a CPU version of pytorch before installing fastai:

pip3 install torch==1.5.0+cpu torchvision==0.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html  
pip3 install fastai

You can find the correct installation verion for your server here.

During the installation of fastai, I hit an issue installing bottleneck. If I remember correctly it was to do with this error:
fatal error: Python.h: No such file or directory compilation terminated. If you come across this, you need to install python dev:

sudo apt install python3.6-dev

Step 3

Now to serve the Flask app using WSGI and Apache. You need to create a file app.wsgi which Apache uses to serve the app. I used the following folder structure:

|----whats-that-tree
|---------app.wsgi
|---------venv
|---------app
|--------------.env
|--------------__init__.py
|--------------modules
|--------------README.md
|--------------static
|--------------templates

In app.wsgi:

#!/usr/bin/python
import sys
sys.path.insert(0,"/path/to/whats-that-tree/")

from app import app as application

Where the path to pass to sys.path.insert is the path to the app directory.

Create a whats-that-tree.conf file:

<VirtualHost *:80>
        ServerName your_domain
        ServerAdmin your@gmail.com

        LoadModule wsgi_module "/path/to/whats-that-tree/venv/lib/python3.6/site-packages/mod_wsgi/server/mod_wsgi-py36.cpython-36m-x86_64-linux-gnu.so"
        WSGIDaemonProcess app_name python-home="/path/to/whats-that-tree/venv"
        WSGIProcessGroup app_name
        WSGIApplicationGroup %{GLOBAL}

        WSGIScriptAlias / /path/to/whats-that-tree/app.wsgi

        <Directory "/path/to/whats-that-tree/app">
                Options Indexes FollowSymLinks
                AllowOverride All
                Require all granted
        </Directory>
        Alias /static /path/to/whats-that-tree/app/static
        <Directory "/path/to/whats-that-tree/app/static">
                Order allow,deny
                Allow from all
        </Directory>
        ErrorLog ${APACHE_LOG_DIR}/error.log
        CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>

You will need to use LoadModule to load mod_wsgi from the correct path in your virtual environment. You can find the correct directives with command:

mod_wsgi-express module-config

See: https://pypi.org/project/mod-wsgi/

Step 4

Enable the site with sudo a2ensite whats-that-tree and reload apache sudo service apache2 reload: it should work if everything is configured correctly.
One last thing, I received many ‘Truncated or oversized response headers received from daemon process’ error messages from Apache and I tried many many fixes. Turns out my server just needed more memory (it initially had 1GB) and scaling up to 2GB was enough!

Fin

Turns out there’s a easier way (somewhat) to get Apache to use the virtual environment. I completely missed it in this guide. You can add this to the top of the app.wsgi file:

activate_this = '/path/to/venv/bin/activate_this.py'
with open(activate_this) as file_:
    exec(file_.read(), dict(__file__=activate_this))

The LoadModule directive is no longer needed and you can replace:

WSGIDaemonProcess app_name python-home="/path/to/whats-that-tree/venv"

With:

WSGIDaemonProcess app_name

That’s it!

Links

My classifier app on GitHub: https://github.com/suet-lee/whats-that-tree
A live version of the classifier, go find a tree to classify! http://whats-that-tree.suetlee.com

Suet Lee