Using Supervisor


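Introduction

Supervisor is a client/server system written in Python for managing processes on Linux/Unix; it does not support Windows. It makes it easy to monitor, start, stop, and restart one or more processes. When a process managed by Supervisor is killed unexpectedly, supervisord detects that it has exited and starts it again automatically, so processes recover on their own and you no longer need to write shell scripts to control them.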
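To make the idea of automatic recovery concrete, here is a minimal sketch of a restart loop in Python. It is not how supervisord is implemented; the function name `run_with_restarts` and the `max_restarts` parameter are assumptions made up for this illustration.

```python
import subprocess
import sys


def run_with_restarts(command, max_restarts=3):
    """Run command and start it again each time it exits with a non-zero code.

    Returns the total number of times the command was started.
    """
    starts = 0
    while True:
        starts += 1
        result = subprocess.run(command)
        if result.returncode == 0:
            break  # clean exit: nothing to recover
        if starts > max_restarts:
            break  # too many failures: give up instead of looping forever
    return starts
```

Calling `run_with_restarts([sys.executable, "-c", "import sys; sys.exit(1)"], max_restarts=2)` starts the failing child once and then restarts it twice before giving up. Supervisor adds much more on top of this idea, such as logging, state tracking, and a control client.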

Installation

pip install supervisor

Checking task status

# supervisorctl status
SmartCoin                        RUNNING   pid 13203, uptime 0:04:05
coin                             RUNNING   pid 30744, uptime 17 days, 20:45:18
deepwellserver                   RUNNING   pid 30257, uptime 30 days, 4:13:01
jingtumassetapi                  RUNNING   pid 14536, uptime 45 days, 19:18:08
moac                             RUNNING   pid 20015, uptime 15 days, 5:15:11
new                              RUNNING   pid 10041, uptime 43 days, 22:41:56
nginx                            RUNNING   pid 18752, uptime 22:59:40
redis                            RUNNING   pid 14542, uptime 45 days, 19:18:08
sonyflakeserver                  FATAL     can't find command 'go'
sparkportal                      RUNNING   pid 26073, uptime 1 day, 23:11:17
sparkportal2                     RUNNING   pid 25732, uptime 1 day, 23:11:21
sparkportal3                     RUNNING   pid 25834, uptime 1 day, 23:11:20
sparkportal4                     RUNNING   pid 25974, uptime 1 day, 23:11:18
sparkuser                        RUNNING   pid 26957, uptime 9 days, 23:07:21
sparkwallet                      RUNNING   pid 29045, uptime 5 days, 15:11:58
summaryservice                   RUNNING   pid 14535, uptime 45 days, 19:18:08

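The first column is the service name. The second column is the state: RUNNING means the task is running, FATAL means it failed to start, STARTING means it is starting up, and STOPPED means it has been stopped. For a running task, the remaining columns show the process ID (pid) and how long the task has been running (uptime). For a task that is not running they show other details instead, such as the error message on the sonyflakeserver line above.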
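The column layout described above can be parsed with a few lines of Python, which is handy for scripting health checks. This is a sketch that assumes the output format shown in this post; `TaskStatus`, `parse_status_line`, and `names_not_running` are names invented for the example, not part of Supervisor.

```python
import re
from typing import List, NamedTuple, Optional


class TaskStatus(NamedTuple):
    name: str            # first column: service name
    state: str           # second column: RUNNING, FATAL, STARTING, STOPPED, ...
    pid: Optional[int]   # process ID, present only when the detail starts with "pid N"
    detail: str          # rest of the line (pid/uptime, error text, or stop time)


_PID_RE = re.compile(r"pid (\d+)")


def parse_status_line(line: str) -> TaskStatus:
    """Split one line of `supervisorctl status` output into its columns."""
    name, state, *rest = line.split(None, 2)
    detail = rest[0].strip() if rest else ""
    match = _PID_RE.match(detail)
    pid = int(match.group(1)) if match else None
    return TaskStatus(name, state, pid, detail)


def names_not_running(output: str) -> List[str]:
    """Return the names of all tasks whose state is not RUNNING."""
    statuses = [parse_status_line(line) for line in output.splitlines() if line.strip()]
    return [status.name for status in statuses if status.state != "RUNNING"]
```

Applied to the listing above, `names_not_running` would report only sonyflakeserver, the one task in the FATAL state.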

## Checking a single task's status
supervisorctl status <service name>

# supervisorctl status sparkportal
sparkportal                      RUNNING   pid 26073, uptime 1 day, 23:12:10

## Stopping a task
supervisorctl stop <service name>

# supervisorctl stop sparkportal
sparkportal: stopped
# supervisorctl status sparkportal
sparkportal                      STOPPED   Jan 05 01:59 PM

## Starting a task
supervisorctl start <service name>

# supervisorctl start sparkportal
sparkportal: started
# supervisorctl status sparkportal
sparkportal                      RUNNING   pid 32207, uptime 0:00:05

## Restarting a task

supervisorctl restart <service name>

# supervisorctl restart sparkportal
sparkportal: stopped
sparkportal: started
# supervisorctl status sparkportal
sparkportal                      RUNNING   pid 4952, uptime 0:00:03
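If you want to drive these commands from a script, a thin Python wrapper is one option. The sketch below only uses the `status`, `start`, `stop`, and `restart` actions shown in this post; the helper names `build_command` and `run_action` are hypothetical, and `run_action` assumes `supervisorctl` is on the PATH.

```python
import subprocess

# Only the supervisorctl actions demonstrated in this post are allowed here.
ALLOWED_ACTIONS = ("status", "start", "stop", "restart")


def build_command(action, service_name):
    """Return the argument list for `supervisorctl <action> <service_name>`."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError("unsupported action: %r" % (action,))
    if not service_name or service_name.startswith("-"):
        raise ValueError("invalid service name: %r" % (service_name,))
    return ["supervisorctl", action, service_name]


def run_action(action, service_name):
    """Run the command and return whatever supervisorctl printed to stdout."""
    completed = subprocess.run(
        build_command(action, service_name),
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting problems, and validating the action up front keeps a typo from reaching the command line.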

## Adding a new task

Task template
[program:<service name>]
command=<start command>
process_name=%(program_name)s ; process_name expr (default %(program_name)s)
numprocs=1 ; number of processes copies to start (def 1)
directory=<working directory> ; directory to cwd to before exec (def no cwd)
;umask=022 ; umask for process (default None)
;priority=999 ; the relative start priority (default 999)
autostart=true ; start at supervisord start (default: true)
autorestart=unexpected ; whether/when to restart (default: unexpected)
startsecs=1 ; number of secs prog must stay running (def. 1)
startretries=3 ; max # of serial start failures (default 3)
exitcodes=0,2 ; 'expected' exit codes for process (default 0,2)
stopsignal=QUIT ; signal used to kill process (default TERM)
stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10)
stopasgroup=false ; send stop signal to the UNIX process group (default false)
killasgroup=false ; SIGKILL the UNIX process group (def false)
;user=skywell ; setuid to this UNIX account to run the program
;redirect_stderr=true ; redirect proc stderr to stdout (default false)
stdout_logfile=/var/log/<service name>.log ; stdout log path, NONE for none; default AUTO
stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
stdout_logfile_backups=1 ; # of stdout logfile backups (default 10)
stdout_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
stdout_events_enabled=false ; emit events on stdout writes (default false)
stderr_logfile=/var/log/<service name>.err ; stderr log path, NONE for none; default AUTO
stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
stderr_logfile_backups=10 ; # of stderr logfile backups (default 10)
stderr_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
stderr_events_enabled=false ; emit events on stderr writes (default false)
environment=A="1",B="2",HOME="/home/skywell" ; process environment additions (def no adds)
serverurl=AUTO ; override serverurl computation (childutils)

First, add a task description file: create sparkportal.conf in the /etc/supervisor directory and copy the task template above into it. Then replace <service name> with the task name sparkportal, <start command> with node www.js, and <working directory> with the directory the program lives in, /usr/local/sparkportal/bin.

The resulting configuration file for sparkportal is:

[program:sparkportal]
command=node www.js
process_name=%(program_name)s ; process_name expr (default %(program_name)s)
numprocs=1                    ; number of processes copies to start (def 1)
directory=/usr/local/sparkportal/bin                ; directory to cwd to before exec (def no cwd)
;umask=022                     ; umask for process (default None)
;priority=999                  ; the relative start priority (default 999)
autostart=true                ; start at supervisord start (default: true)
autorestart=unexpected        ; whether/when to restart (default: unexpected)
startsecs=1                   ; number of secs prog must stay running (def. 1)
startretries=3                ; max # of serial start failures (default 3)
exitcodes=0,2                 ; 'expected' exit codes for process (default 0,2)
stopsignal=QUIT               ; signal used to kill process (default TERM)
stopwaitsecs=10               ; max num secs to wait b4 SIGKILL (default 10)
stopasgroup=false             ; send stop signal to the UNIX process group (default false)
killasgroup=false             ; SIGKILL the UNIX process group (def false)
;user=skywell                  ; setuid to this UNIX account to run the program
;redirect_stderr=true          ; redirect proc stderr to stdout (default false)
stdout_logfile=/var/log/sparkportal.log        ; stdout log path, NONE for none; default AUTO
stdout_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
stdout_logfile_backups=1     ; # of stdout logfile backups (default 10)
stdout_capture_maxbytes=1MB   ; number of bytes in 'capturemode' (default 0)
stdout_events_enabled=false   ; emit events on stdout writes (default false)
stderr_logfile=/var/log/sparkportal.err        ; stderr log path, NONE for none; default AUTO
stderr_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
stderr_logfile_backups=10     ; # of stderr logfile backups (default 10)
stderr_capture_maxbytes=1MB   ; number of bytes in 'capturemode' (default 0)
stderr_events_enabled=false   ; emit events on stderr writes (default false)
environment=A="1",B="2",HOME="/home/skywell"       ; process environment additions (def no adds)
serverurl=AUTO                ; override serverurl computation (childutils)
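Filling in a program section by hand works for one task, but with many tasks it can be easier to generate the section. The following is a minimal sketch using Python's standard `configparser`; it writes only a handful of the options from the file above, and the function name `render_program_config` is an assumption made for this example.

```python
import configparser
import io


def render_program_config(name, command, directory):
    """Return the text of a [program:<name>] section with a few basic options."""
    # RawConfigParser does no interpolation, so values are written as given.
    parser = configparser.RawConfigParser()
    # Keep option names exactly as written instead of lower-casing them.
    parser.optionxform = str
    section = "program:" + name
    parser.add_section(section)
    parser.set(section, "command", command)
    parser.set(section, "directory", directory)
    parser.set(section, "autostart", "true")
    parser.set(section, "autorestart", "unexpected")
    parser.set(section, "stdout_logfile", "/var/log/%s.log" % name)
    parser.set(section, "stderr_logfile", "/var/log/%s.err" % name)
    buffer = io.StringIO()
    # Write "key=value" without spaces, matching the style used in this post.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()
```

For example, `render_program_config("sparkportal", "node www.js", "/usr/local/sparkportal/bin")` returns text that could be saved as sparkportal.conf. Options left out fall back to Supervisor's defaults.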

## Loading the new task

supervisorctl update

# supervisorctl update
sparkportal: added process group

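This command starts the task described in sparkportal.conf and brings it under Supervisor's management. You can then use the status command to check the new task's state. If it fails to run, check the logs under /var/log (the stdout_logfile and stderr_logfile paths set in the configuration file) to find the cause.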
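When a task fails, the last few lines of its log are usually the most useful. Here is a small sketch for reading them in Python; `last_lines` and `tail_file` are hypothetical helper names, and the path in the usage note is the stderr log path configured in this post.

```python
from collections import deque


def last_lines(lines, count=20):
    """Return the last `count` items of an iterable of text lines, newline-stripped."""
    # A bounded deque keeps only the newest `count` lines in memory.
    return [line.rstrip("\n") for line in deque(lines, maxlen=count)]


def tail_file(path, count=20):
    """Return the last `count` lines of the text file at `path`."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return last_lines(handle, count)
```

For the task configured above, `tail_file("/var/log/sparkportal.err")` would return the most recent error output, provided the file exists and is readable by the current user.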

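## Setting environment variables
Find the environment line in the configuration file; if there is none, add environment=NAME="value" at the bottom of the program's section. Separate multiple environment variables with commas, for example environment=NAME1="value1",NAME2="value2".

To set the Node.js runtime environment to production, add the following line:
environment=NODE_ENV="production"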
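The comma-separated format is easy to get wrong by hand once there are several variables. Below is a minimal sketch that builds the line from a dictionary. The function name `format_environment` is an assumption, and the sketch deliberately rejects values containing double quotes or percent signs rather than guessing at how to escape them.

```python
def format_environment(variables):
    """Build an `environment=` line in the KEY="value",KEY2="value2" form."""
    parts = []
    for key, value in variables.items():
        text = str(value)
        # Quotes and percent signs need special handling in the config file;
        # this sketch refuses them instead of escaping them.
        if '"' in text or "%" in text:
            raise ValueError("unsupported character in value for %s" % key)
        parts.append('%s="%s"' % (key, text))
    return "environment=" + ",".join(parts)
```

With `{"NODE_ENV": "production"}` as input this yields the same line as above. Every value is quoted, so values containing characters such as colons or slashes are written safely as well.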