Open Prose Metrics

This repo is for building the Open Prose Metrics appliance with TKLDEV from TurnKey GNU/Linux (based on their LAMP and Core appliances).


What about it?

Feed this web application a file, a URL, or pasted English prose, and it provides data from which an assessment can be made, for the purposes of:

  • planning substantive revision
  • setting writing goals or objectives, as case managers might do
  • tracking your progress toward a writing objective or challenge
  • tracking a student's progress toward written language goals, or written expression or mechanics objectives
  • generating speech from text for proofing or as an accessibility tool (downloadable as mp3)

Python3, Flask, Apache2, mod_wsgi, stanford_ner, and NLTK are involved.

.art/screenshot.png

.art/screenshot2.png

Proposed logo if ultimately used by TKL

Tested Environment

Linux: Debian Buster (TKL LAMP Appliance)

Python: Python3.7.3

According to flask --version:

Python 3.7.3
Flask 1.1.1
Werkzeug 0.16.0

Apache2 with mod_wsgi for python3

Try It

The file structure, a rough deployment script, steps for building mod_wsgi from source, and a sample Apache2 virtual host configuration for the application are provided below.

A demo server runs at https://opm.gonzotraining.com. Its interface differs a bit from what is provided here.

There are variables at the top of conf.d/main. TARGET DIRECTORY sets the filesystem location of the app's parent directory, which contains both the app and the virtualenv.

File Structure

This layout is not for this repo, but it is close.

/
--/home/$USER/opm/ # app home. No __init__.py present.
  |--wsgi.py # gunicorn wsgi file
  |--input/read_document # from opm/app/input/ directory, not opm/opm/input/
--/tmp/
  |--gunicorn.sock # created dynamically as needed. Will have permissions from service definition(?)
--/usr/share/
  |--stanford-ner/ # create using sudo as $USER (owner of app)
  |--nltk_data/ # create directory, then python -m nltk.downloader -d /usr/share/nltk_data stopwords words punkt brown vader_lexicon averaged_perceptron_tagger maxent_ne_chunker
--/etc/
  |--systemd/system/gunicorn.service # specifically for opm
  |--apache2/sites-available/opm.conf
--/var/
  |--log/opm/ # todo: change to opm
     |--error.log # for the apache vhost
  |--log/extraeyes/
     |--app.log # flask output to stdout and stderr when running with flask app in venv
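
The NLTK data directory in the tree above can be sanity-checked after deployment. The following is a minimal sketch, not part of this repo: the function name is hypothetical, and the mapping of package names to category subdirectories (corpora, tokenizers, and so on) is an assumption based on how NLTK usually lays out its data, so verify it against your own /usr/share/nltk_data.

```python
from pathlib import Path

# Assumed on-disk layout of the NLTK packages named in the tree above.
# Each package is expected under <data_dir>/<category>/<name>, either
# unpacked as a directory or left as <name>.zip.
EXPECTED_NLTK_PACKAGES = {
    "stopwords": "corpora",
    "words": "corpora",
    "brown": "corpora",
    "punkt": "tokenizers",
    "vader_lexicon": "sentiment",
    "averaged_perceptron_tagger": "taggers",
    "maxent_ne_chunker": "chunkers",
}


def missing_nltk_packages(data_dir="/usr/share/nltk_data"):
    """Return a sorted list of package names not found under data_dir."""
    root = Path(data_dir)
    missing = []
    for name, category in EXPECTED_NLTK_PACKAGES.items():
        unpacked = root / category / name
        zipped = root / category / (name + ".zip")
        if not unpacked.exists() and not zipped.exists():
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    print("Missing NLTK packages:", missing_nltk_packages() or "none")
```

Run it as the app owner once the downloader has finished. An empty result only means the directories or archives exist, not that their contents are complete.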

Draft Deployment: Flask, Apache2 w mod_wsgi (ubuntu-18.04)

Example steps to get up and running. They are reflected in the code in this repo but not used. (This is for a Vagrant install.)

application user: vagrant

#!/bin/bash -ex


INSTALLER_DIR="/vagrant"
TARGET_DIR="/var/www/html/open-prose-metrics"
APP_OWNER="vagrant"
APP_GROUP="www-data"
LOG_DIR="/var/log/opm"
VENV="/var/www/html/open-prose-metrics/virtualenv"
NER_DIR="/usr/share"
NLTK_DATA_DIR="/usr/share/nltk_data"

# dpkg-reconfigure tzdata (automation?)
ln -fs /usr/share/zoneinfo/US/Eastern /etc/localtime && dpkg-reconfigure -f noninteractive tzdata

# Repository - General
echo "Install from Ubuntu Package Manager"
apt update && apt upgrade -y
apt install -y linux-headers-$(uname -r)
apt install -y apache2 python3 python3-dev python3-pip
apt install -y python3.7 python3.7-dev
apt install -y libenchant-dev unzip openjdk-8-jre-headless
# Repository - depends for pycurl
apt install -y libcurl4-openssl-dev libssl-dev
# Repository - for scipy and numpy
apt install -y libatlas-base-dev
# Repository - possibly required by numpy and scipy
apt install -y libpython3.7 libpython3.7-dev gcc gfortran python-dev libopenblas-dev liblapack-dev cython
# Repository - compiling mod-wsgi for python: dependencies
apt install -y apache2-dev

# Stanford NER (Required by nltk for opm NER)
wget https://nlp.stanford.edu/software/stanford-ner-4.0.0.zip
unzip stanford-ner-4.0.0.zip -d "$NER_DIR"
mv "$NER_DIR/stanford-ner-4.0.0" "$NER_DIR/stanford-ner"
# Change ownership of the NER directory only, not of everything under $NER_DIR (/usr/share)
chown -R "$APP_OWNER:$APP_GROUP" "$NER_DIR/stanford-ner"

# Pip: system-wide dependencies
echo "Get Pip for Python3.7"
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3.7 get-pip.py
pip3.7 install wheel virtualenv ipython
echo "Create VENV"
virtualenv -p python3.7 $VENV
source $VENV/bin/activate
echo "Install from venv pip"
pip install -r "$INSTALLER_DIR/opm/requirements.txt"
# opm
echo "Copy App Folder"
cp -r "$INSTALLER_DIR/opm" "$TARGET_DIR/opm" # copy app folder
# Create and set permissions
echo "Create Files and Directories and Set Permissions"
# -p keeps the script from aborting under "bash -e" when a directory already exists
mkdir -p /var/log/extraeyes "$LOG_DIR" "/home/$APP_OWNER/.plotly"
touch /var/log/extraeyes/app.log
chown -R $APP_OWNER:$APP_GROUP /var/log/extraeyes
chown -R $APP_OWNER:$APP_GROUP /var/log/opm
chown -R $APP_OWNER:$APP_GROUP /home/$APP_OWNER/.plotly
chmod 775 /home/$APP_OWNER/.plotly
chown -R $APP_OWNER:$APP_GROUP $TARGET_DIR
#nltk_data
echo "Fetching NLTK data"
python -m nltk.downloader -d "$NLTK_DATA_DIR" stopwords words punkt brown vader_lexicon averaged_perceptron_tagger maxent_ne_chunker # for ner
echo "Pre-create Postagger (depends on stanford NER)"
chown -R $APP_OWNER:$APP_GROUP $NER_DIR/stanford-ner
cd $TARGET_DIR/opm && python postagger.py
echo "Seeding Database and Testing Backend"
cd $TARGET_DIR/opm && python seed_database.py
echo "Configuring Apache2"
cp $INSTALLER_DIR/opm-configs/etc/apache2/sites-available/*.conf /etc/apache2/sites-available/
sed -i "s|{{ owner }}|$APP_OWNER|" /etc/apache2/sites-available/opm.conf
sed -i "s|{{ group }}|$APP_GROUP|" /etc/apache2/sites-available/opm.conf
a2dissite 000-default
a2ensite opm.conf
apt clean
#systemctl reload apache2
# END - mod_wsgi will be called
echo "Finished. Next, wsgi-install will be called to compile, install mod_wsgi and reload apache2"
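
The two sed commands in the script fill the `{{ owner }}` and `{{ group }}` placeholders in opm.conf. For illustration only, here is the same substitution sketched in Python. The function name is hypothetical, and unlike the sed commands (which replace only the first match on each line) this version replaces every occurrence.

```python
def render_placeholders(template, values):
    """Replace each '{{ key }}' placeholder in template with its value."""
    rendered = template
    for key, value in values.items():
        rendered = rendered.replace("{{ " + key + " }}", value)
    return rendered


if __name__ == "__main__":
    # Same form as the commented-out WSGIDaemonProcess line in opm.conf.
    line = "WSGIDaemonProcess opm user={{ owner }} group={{ group }} threads=5"
    print(render_placeholders(line, {"owner": "vagrant", "group": "www-data"}))
```

Placeholders with no entry in `values` are left untouched, which makes an unfilled template easy to spot in the resulting file.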

Configure, Make, Install mod_wsgi

Not necessary here: the mod_wsgi package for Python 3 that ships with Buster (libapache2-mod-wsgi-py3) works fine.

cd /vagrant/
unzip /vagrant/4.7.1.zip
cd mod_wsgi-4.7.1/
./configure --with-python=/usr/bin/python3.7
make
make install

#cp /usr/lib/apache2/modules/mod_wsgi.so /etc/apache2/modules/
chmod 644 /usr/lib/apache2/modules/mod_wsgi.so
systemctl reload apache2

http server conf apache2

Example. Reflected in this repo but not used.

LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so

<VirtualHost *>
        ServerName blah.com
        ServerAlias opm.blah.com
        ServerAlias essayeyes.com
        #WSGIDaemonProcess opm user={{ owner }} group={{ group }} threads=5
        WSGIDaemonProcess opm user=vagrant group=www-data threads=5 python-path=/var/www/html/open-prose-metrics/opm:/var/www/html/open-prose-metrics/virtualenv/lib/python3.7/site-packages
        WSGIScriptAlias / /var/www/html/open-prose-metrics/opm/opm.wsgi
        <Directory /var/www/html/open-prose-metrics>
                WSGIProcessGroup opm
                WSGIApplicationGroup %{GLOBAL}
                # Apache 2.4 syntax; replaces the 2.2-style Order/Allow directives
                Require all granted
        </Directory>
        ErrorLog /var/log/opm/error.log
</VirtualHost>
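
The WSGIScriptAlias line above points at opm.wsgi, which this README does not reproduce. mod_wsgi expects that file to expose a module-level callable named `application`. What follows is a stand-in sketch under stated assumptions, not the repo's actual file: the placeholder response is invented for illustration, and the commented import uses hypothetical module and object names.

```python
import sys

# Same directory as the first python-path entry in the vhost. With python-path
# set on WSGIDaemonProcess this insert is redundant, but it is harmless.
APP_DIR = "/var/www/html/open-prose-metrics/opm"
if APP_DIR not in sys.path:
    sys.path.insert(0, APP_DIR)

# In a real deployment the callable would come from the Flask app, e.g.:
#     from app import app as application   # hypothetical module/object names


def application(environ, start_response):
    """Placeholder WSGI callable that answers every request with plain text."""
    body = b"Open Prose Metrics placeholder\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Serving the placeholder first is one way to confirm that the vhost, daemon process, and file permissions work before the full application and its NLTK dependencies are involved.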

Sample Vagrant File

Not pertinent here.

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "bento/ubuntu-18.04"
  config.vm.network "forwarded_port", guest: 80, host: 8081
  config.vm.network "forwarded_port", guest: 5000, host: 5000
  #config.vm.provision "file", source: "opm", destination: "~/opm"
  #config.vm.provision "file", source: "flask-nginx-gunicorn/gunicorn/systemd/gunicorn.service", destination: "/etc/systemd/system/gunicorn.service"
  #config.vm.provision "file", source: "flask-nginx-gunicorn/nginx/opm", destination: "/etc/nginx/sites-available/opm"
  config.vm.provision "shell", path: "deploy.sh"
  config.vm.provision "shell", path: "wsgi-install.sh"
  config.vm.provider :virtualbox do |vb|
    # Don't boot with headless mode
    # vb.gui = true

    # Use VBoxManage to customize the VM. For example, to change memory:
    vb.customize ["modifyvm", :id, "--memory", "1024", "--cpus", "2"] # once installed, 2048 is ok. One vcpu is fine
  end
end

© rik goldman, MIT LICENSE

High-school students at Chelsea School in Hyattsville, MD, gave encouragement and supported initial explorations into just how viable this project would be.