At work I have wanted to implement system monitoring using Graphite and Grafana. Eventually I will use this to monitor our Lustre storage system in particular, with either collectl or collectd shipping Lustre and general host stats to Graphite. To try things out, I have installed the stack on my Kimsufi server.
The Graphite installation documentation uses Apache mod_fcgi, but I already run various other Python projects with gunicorn behind an NGINX or Apache front-end proxy, so I will use gunicorn to run the graphite-web app and NGINX to proxy both it and Grafana over HTTPS.
I run CentOS 7 on my Kimsufi 2c server. The procedure to install Graphite, Grafana, gunicorn and NGINX is as follows:
yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# Distribution packages
yum install -y httpd net-snmp perl python-devel git gcc-c++ pycairo mod_wsgi libffi-devel
yum install -y python-pip node npm
# Python packages via pip
pip install django
pip install django-tagging
pip install pytz
pip install Twisted==16.4.1
# Installing graphite from master, as the stable release is quite old at the moment
export PYTHONPATH="/opt/graphite/lib/:/opt/graphite/webapp/"
pip install --no-binary=:all: https://github.com/graphite-project/whisper/tarball/master
pip install --no-binary=:all: https://github.com/graphite-project/carbon/tarball/master
pip install --no-binary=:all: https://github.com/graphite-project/graphite-web/tarball/master
sudo cp /opt/graphite/conf/storage-schemas.conf.example /opt/graphite/conf/storage-schemas.conf
sudo cp /opt/graphite/conf/storage-aggregation.conf.example /opt/graphite/conf/storage-aggregation.conf
sudo cp /opt/graphite/conf/graphTemplates.conf.example /opt/graphite/conf/graphTemplates.conf
sudo cp /opt/graphite/conf/graphite.wsgi.example /opt/graphite/conf/graphite.wsgi
sudo cp /opt/graphite/webapp/graphite/local_settings.py.example /opt/graphite/webapp/graphite/local_settings.py
sudo cp /opt/graphite/conf/carbon.conf.example /opt/graphite/conf/carbon.conf
vi /opt/graphite/conf/storage-schemas.conf
Add the default retention:
[default]
pattern = .*
retentions = 12s:4h, 2m:3d, 5m:8d, 13m:32d, 1h:1y
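To get a feel for what a retention string means, here is a small sketch that converts each `precision:duration` pair into a precision in seconds and a number of stored points. It is only an illustration, not how carbon itself parses the file: the function names `to_seconds` and `archive_points` are my own, and I am assuming a year counts as 365 days and that the point count is rounded down.

```sh
#!/bin/sh
# Convert a duration such as 12s, 2m, 4h, 8d or 1y into seconds.
# Assumption: a year is treated as 365 days.
to_seconds() {
    num=${1%?}          # everything except the trailing unit letter
    unit=${1#"$num"}    # the trailing unit letter
    case $unit in
        s) echo "$num" ;;
        m) echo $((num * 60)) ;;
        h) echo $((num * 3600)) ;;
        d) echo $((num * 86400)) ;;
        w) echo $((num * 604800)) ;;
        y) echo $((num * 31536000)) ;;
        *) echo "unknown unit in '$1'" >&2; return 1 ;;
    esac
}

# Print "precision_seconds points" for one archive, e.g. 12s:4h.
# The point count uses integer division, so it is rounded down.
archive_points() {
    precision=$(to_seconds "${1%%:*}") || return 1
    span=$(to_seconds "${1##*:}") || return 1
    echo "$precision $((span / precision))"
}

# Walk through the archives from the [default] schema above.
for archive in 12s:4h 2m:3d 5m:8d 13m:32d 1h:1y; do
    printf '%s -> ' "$archive"
    archive_points "$archive"
done
```

Whisper's documentation also asks that each finer precision divides every coarser one evenly, so it may be worth checking a schema against that rule (120 seconds does not divide 300 seconds, for example) before relying on it.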
useradd -d /opt/graphite graphite
chown -R graphite /opt/graphite
vi /opt/graphite/conf/carbon.conf
Make these edits:
USER = graphite
Create the systemd unit file /etc/systemd/system/carbon.service:
[Unit]
Description = Carbon Metrics store

[Service]
Type = forking
GuessMainPID = false
PIDFile = /opt/graphite/storage/carbon-cache-a.pid
ExecStart = /opt/graphite/bin/carbon-cache.py start

[Install]
WantedBy = multi-user.target
systemctl daemon-reload
systemctl start carbon
systemctl status carbon
systemctl enable carbon
Edit /opt/graphite/webapp/graphite/local_settings.py
SECRET_KEY = ''  # Get a random hash to put here
DEBUG = False
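One way to get a random value for SECRET_KEY is to read a few bytes from /dev/urandom and print them as hex. This is just a sketch of one approach; the function name `random_hex_key` and the choice of 32 bytes are my own assumptions, and any sufficiently long random string should do.

```sh
#!/bin/sh
# Print 64 hexadecimal characters (32 random bytes) read from /dev/urandom.
random_hex_key() {
    # od dumps the bytes as hex; tr strips the spaces and line breaks od adds.
    od -An -v -tx1 -N32 /dev/urandom | tr -d ' \n'
    echo
}

random_hex_key
```

Paste the printed value between the quotes of SECRET_KEY in local_settings.py.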
Install nginx, gunicorn:
yum install nginx
pip install gunicorn
Set up the webapp:
cp /opt/graphite/conf/graphite.wsgi.example /opt/graphite/webapp/wsgi.py
cd /opt/graphite/webapp/graphite
sudo -u graphite python manage.py migrate
Create /etc/systemd/system/graphite-web.service:
[Unit]

[Service]
Environment=PYTHONPATH="/opt/graphite/lib/:/opt/graphite/webapp/"
WorkingDirectory=/opt/graphite/webapp
ExecStart=/usr/bin/gunicorn -u graphite -g graphite -b 127.0.0.1:8080 --log-file=/opt/graphite/storage/log/webapp/gunicorn.log wsgi:application
Restart=on-failure
User=graphite
Group=graphite
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl start graphite-web
systemctl status graphite-web
systemctl enable graphite-web
Set up a self-signed certificate for nginx for now, written to the paths that the nginx configuration refers to:
cd /etc/nginx
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout /etc/nginx/nginx-selfsigned.key \
    -out /etc/nginx/nginx-selfsigned.crt
Create /etc/nginx/conf.d/graphite.conf
server {
    listen 80;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name graphite.example.com;

    ssl_certificate /etc/nginx/nginx-selfsigned.crt;
    ssl_certificate_key /etc/nginx/nginx-selfsigned.key;
    ssl_session_cache builtin:1000 shared:SSL:10m;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
    ssl_prefer_server_ciphers on;

    location /graphite/static/ {
        alias /opt/graphite/webapp/content/;
    }

    location /graphite {
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # Fix the "It appears that your reverse proxy set up is broken" error.
        proxy_pass http://127.0.0.1:8080/graphite;
        proxy_read_timeout 90;
        proxy_redirect http://127.0.0.1:8080/graphite https://graphite.example.com/graphite;
    }
}
Open the firewall and start nginx:
firewall-cmd --add-service=http
firewall-cmd --add-service=https
firewall-cmd --runtime-to-permanent
systemctl start nginx
systemctl enable nginx
I will use collectl, running locally on the same server, to make sure that data is getting into Graphite.
yum install collectl
# Test it
collectl --export graphite,127.0.0.1
# Configure in /etc/collectl.conf
DaemonCommands = -smcdn --export=graphite,127.0.0.1
# Start
systemctl start collectl
systemctl enable collectl
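If collectl data does not show up, it can help to push a single metric by hand. Carbon's plaintext listener takes lines of the form `<path> <value> <timestamp>`, on port 2003 by default. The sketch below is only an illustration: the helper names `carbon_line` and `send_metric` and the metric path `test.manual.ping` are made up, and it assumes `nc` is installed and that carbon.conf still uses the default port.

```sh
#!/bin/sh
# Format one line of Carbon's plaintext protocol: "<path> <value> <timestamp>".
carbon_line() {
    # $3 is optional; it defaults to the current Unix time.
    printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}"
}

# Send one metric to a Carbon plaintext listener.
# $3 (host) and $4 (port) are optional and default to 127.0.0.1:2003.
# Assumption: nc is available; -w 1 gives up after one second.
send_metric() {
    carbon_line "$1" "$2" | nc -w 1 "${3:-127.0.0.1}" "${4:-2003}"
}

# Example (hypothetical metric name, needs a running carbon-cache):
#   send_metric test.manual.ping 1
```

After sending, the metric should appear in the graphite-web tree under the path that was used.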
Now browse to https://localhost/graphite and check that data is making it into Graphite and that the graphite-web interface is working.
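The same check can be done from the command line through graphite-web's render API, which can return JSON for a target. The following is a rough sketch rather than a polished tool: `render_url` is a helper name I made up, and the target in the example is one of carbon's own internal metrics, which should exist once carbon-cache has been running for a while.

```sh
#!/bin/sh
# Build a render API URL that returns JSON for one target.
# $1 = base URL of graphite-web, $2 = target, $3 = optional relative start time.
render_url() {
    # ${1%/} drops a trailing slash from the base URL, if there is one.
    printf '%s/render?target=%s&from=%s&format=json\n' "${1%/}" "$2" "${3:--10min}"
}

# Example (-k because the certificate is self-signed):
#   curl -sk "$(render_url https://localhost/graphite 'carbon.agents.*.metricsReceived')"
```

A non-empty JSON list with recent datapoints suggests that carbon, graphite-web and the nginx proxy are all working together.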
Grafana is a nicer dashboard front-end than the default graphite-web, so we will use that by default.
yum install https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-4.2.0-1.x86_64.rpm
systemctl daemon-reload
systemctl start grafana-server
systemctl status grafana-server
systemctl enable grafana-server
Add the Grafana proxy location inside the HTTPS server block in /etc/nginx/conf.d/graphite.conf:
location / {
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    # Fix the "It appears that your reverse proxy set up is broken" error.
    proxy_pass http://127.0.0.1:3000/;
    proxy_read_timeout 90;
}
systemctl restart nginx