The zenticket daemon

How does it work?

zenticket performs the following procedures each cycle:

The ticket create script performs the following:


How is it configured?

The zenticket daemon is configured through the zenticket.conf file. To edit it, log in to the Zenoss server via SSH, become the zenoss user, and edit $ZENHOME/etc/zenticket.conf. Alternatively, log in to the Zenoss user interface, navigate to Settings → Daemons, select “view config” for zenticket, and then select “edit this configuration”.

The zenticket.conf file looks like the following:

# Examples:
#
# Single client Zenoss server:
#
# [CUSTOMER NAME]
# id:cust-00000
# queue:Front Line
# group1:None
# group2:/Network
# group3:/Security
# group4:/Server
# group5:/Up-Down
#
# Multi-client Zenoss server:
#
# [CUSTOMER NAME]
# id:cust-00000
# queue:Front Line
# customer=CUSTOMER NAME
# group1:/%(customer)s
# group2:/%(customer)s/Network
# group3:/%(customer)s/Security
# group4:/%(customer)s/Server
# group5:/%(customer)s/Up-Down

[GENERAL]
ticketscript:/home/zenoss/zenticket/create_ticket.pl
cycletime:30

[LAB]
id:nnl-00000
queue:Front Line
group1:None
group2:/Network
group3:/Security
group4:/Server
group5:/Up-Down

As the commented-out examples in the config show, the file can be used on Zenoss servers dedicated to a single client as well as on servers that monitor multiple clients. The path to the ticket create script, the cycle time of the daemon, and each customer's id, queue, and groups are all defined in the zenticket.conf file.
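
The multi-client example relies on ConfigParser's %(name)s interpolation: each section defines a customer option once, and the group paths are expanded from it. Here is a minimal sketch of that expansion. It assumes Python 3's configparser module (the daemon itself uses the Python 2 ConfigParser module), and the customer name ACME and the helper expanded_groups are made up for the illustration:

```python
import configparser

# Hypothetical multi-client section; "ACME" stands in for a real customer name.
SAMPLE_CONF = """\
[ACME]
id:cust-00000
queue:Front Line
customer=ACME
group1:/%(customer)s
group2:/%(customer)s/Network
group3:/%(customer)s/Security
"""


def expanded_groups(conf_text, section):
    """Return the group paths of a section with %(customer)s filled in."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # items() returns interpolated values; keep only the groupN options.
    return [value for key, value in sorted(parser.items(section))
            if key.startswith("group")]


print(expanded_groups(SAMPLE_CONF, "ACME"))
# ['/ACME', '/ACME/Network', '/ACME/Security']
```

Both ":" and "=" are accepted as separators, which is why the customer line can use "=" while the other options use ":".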

In this case the path to the ticket script is set to /home/zenoss/zenticket/create_ticket.pl, the cycle time is set to 30 seconds, and the client LAB has been given a customer id of nnl-00000 and a queue of Front Line. There are also groups defined for this client. If an event comes in for a device in any of the listed device groups, a ticket is generated in the Front Line queue with a customer id of nnl-00000.
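
The matching rule can be sketched as a small standalone function. This is an illustration in Python 3, not the daemon's code. The helper find_customer is hypothetical, it assumes an event's device groups arrive as a single string separated by "|" (as the daemon code splits them), it uses a plain prefix test where the daemon uses re.match, and it returns the first matching section:

```python
import configparser

CONF = """\
[GENERAL]
ticketscript:/home/zenoss/zenticket/create_ticket.pl
cycletime:30

[LAB]
id:nnl-00000
queue:Front Line
group1:None
group2:/Network
group3:/Security
group4:/Server
group5:/Up-Down
"""


def find_customer(conf_text, device_groups):
    """Return (customer id, queue) for the first section whose groups match, else None."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Split the event's group string and drop empty names.
    event_groups = [g for g in device_groups.split("|") if g]
    for section in parser.sections():
        if section == "GENERAL":
            continue
        configured = [v for k, v in parser.items(section) if k.startswith("group")]
        for group in event_groups:
            # A configured group matches any device group path that starts with it.
            if any(group.startswith(prefix) for prefix in configured):
                return parser.get(section, "id"), parser.get(section, "queue")
    return None


print(find_customer(CONF, "|/Network/Routers"))  # ('nnl-00000', 'Front Line')
print(find_customer(CONF, "|/Printers"))         # None
```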

After the config file is edited, the daemon must be restarted to pick up the new configuration, because the file is read only once at startup. This can be done in two ways. The first is by logging in to the Zenoss server via SSH, becoming the zenoss user, and executing “zenticket restart”. The second is by logging in to the Zenoss user interface, navigating to Settings → Daemons, and clicking on “Restart” for zenticket.


Where does it log to?

The zenticket daemon currently sends log info to $ZENHOME/log/zenticket.log. This log file can be viewed either via SSH or via the user interface by logging in, navigating to Settings → Daemons, and clicking “view log” for zenticket.

At the moment logging is limited to daemon startup, shutdown, and the number of times the ticket create script ran in a cycle, but I do intend to implement more detailed logging in a future version of the daemon.

Example log file output:

2009-11-18 16:45:33 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:46:04 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:46:35 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:47:10 INFO zen.zenticket: ticket create script ran 6 times
2009-11-18 16:47:55 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:48:26 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:48:58 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:49:29 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:50:00 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:50:32 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:51:03 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:51:34 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:52:06 INFO zen.zenticket: ticket create script ran 2 times
2009-11-18 16:53:19 INFO zen.zenticket: Deleting PID file /usr/local/zenoss/zenoss/var/zenticket-localhost.pid ...
2009-11-18 16:53:19 INFO zen.zenticket: zenticket shutting down
2009-11-18 16:53:43 INFO zen.zenticket: Starting zenticket
2009-11-18 16:54:19 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:54:50 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:55:22 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:55:53 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:56:24 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:56:56 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:57:27 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:57:58 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:58:29 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:59:01 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 16:59:32 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 17:00:09 INFO zen.zenticket: ticket create script ran 6 times
2009-11-18 17:00:46 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 17:01:30 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 17:02:02 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 17:02:33 INFO zen.zenticket: ticket create script ran 1 time
2009-11-18 17:03:05 INFO zen.zenticket: ticket create script ran 1 time
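
Because every line follows the same "timestamp level zen.zenticket: message" layout, the log is easy to summarize. The following Python 3 sketch totals the run counts; the helper count_ticket_runs is a made-up name for this illustration and is not part of the daemon:

```python
import re

# Matches the per-cycle summary lines shown in the sample log.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) INFO zen\.zenticket: "
    r"ticket create script ran (\d+) times?$"
)


def count_ticket_runs(lines):
    """Sum the per-cycle run counts found in zenticket.log lines."""
    total = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            total += int(match.group(2))
    return total


sample = [
    "2009-11-18 16:46:35 INFO zen.zenticket: ticket create script ran 1 time",
    "2009-11-18 16:47:10 INFO zen.zenticket: ticket create script ran 6 times",
    "2009-11-18 16:53:19 INFO zen.zenticket: zenticket shutting down",
    "2009-11-18 16:53:43 INFO zen.zenticket: Starting zenticket",
]
print(count_ticket_runs(sample))  # 7
```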

The code

Here is the current code for the zenticket daemon (it is written in Python):

#!/usr/bin/env python

from daemon import Daemon
import os, sys

pidfile = os.path.join(os.environ['ZENHOME'], 'var/zenticket-localhost.pid')
zenconfpath = os.path.join(os.environ['ZENHOME'], 'etc/zenticket.conf')
logfile = os.path.join(os.environ['ZENHOME'], 'log/zenticket.log')

import logging

for handler in logging.root.handlers[:]:
    logging.root.removeHandler(handler)

logging.basicConfig(level=logging.INFO,
        format='%(asctime)s %(levelname)s zen.zenticket: %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S',
        filename=logfile,
        filemode='a')

class MyDaemon(Daemon):
    def run(self):
        import Globals
        from Products.ZenUtils.ZenScriptBase import ZenScriptBase
        from transaction import commit

        dmd = ZenScriptBase(connect=True).dmd

        from Products.ZenUtils import Time
        from subprocess import call
        from MySQLdb import OperationalError
        import time, socket, re, subprocess, ConfigParser, datetime

        config = ConfigParser.ConfigParser()
        config.read([zenconfpath])

        ticketscript = config.get("GENERAL", "ticketscript")
        cycletime = config.get("GENERAL", "cycletime")

        events = {}

        for handler in logging.root.handlers[:]:
            logging.root.removeHandler(handler)

        logging.basicConfig(level=logging.INFO,
                format='%(asctime)s %(levelname)s zen.zenticket: %(message)s',
                datefmt='%Y-%m-%d %H:%M:%S',
                filename=logfile,
                filemode='a')

        sys.stderr = open(logfile, 'a')

        while True:
            ticketscreated = 0

            # Fetch the current event list once and forget any tracked event
            # that is no longer in it.
            eventlist = dmd.ZenEventManager.getEventList([], "", "lastTime ASC, firstTime ASC")
            current = set([e.evid for e in eventlist])
            for k in events.keys():
                if k not in current:
                    del events[k]

            for e in dmd.ZenEventManager.getEventList([], "", "lastTime ASC, firstTime ASC"):
                evt = None
                create = 0
                if e.evid in events:
                    if e.count > events[e.evid]:
                        create = 1
                        events[e.evid] = e.count
                        evt = dmd.ZenEventManager.getEventDetailFromStatusOrHistory(e.evid)
                else:
                    evt = dmd.ZenEventManager.getEventDetailFromStatusOrHistory(e.evid)
                    if evt.eventState == 0:
                        create = 1
                        events[e.evid] = e.count
                    else:
                        events[e.evid] = e.count

                if evt != None:
                    if create == 1:
                        ticket = None
                        p = None

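                        # Only ticket events from production devices (prodState 1000) with a
                        # severity of Warning (3) or higher, skipping command timeouts,
                        # "Unknown" summaries, and devices still in the /Discovered class.
                        if not re.match("Command timed out on device", evt.message) and evt.prodState == 1000 and evt.severity >= 3 and \
                            not re.match("Unknown", evt.summary) and not re.match("/Discovered", evt.DeviceClass):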

                            groupmatch = 0

                            for s in config.sections():
                                if not re.match('GENERAL', s):
                                    groups = evt.DeviceGroups
                                    groups = groups.split('|')
                                    for g in groups:
                                        # Skip empty group names.
                                        if not re.match('^$', str(g)):
                                            if re.match(str(config.get(s, "group1")), str(g)) or re.match(str(config.get(s, "group2")), str(g)) or \
                                                re.match(str(config.get(s, "group3")), str(g)) or re.match(str(config.get(s, "group4")), str(g)) or \
                                                re.match(str(config.get(s, "group5")), str(g)):

                                                groupmatch = 1
                                                custid = config.get(s, "id")
                                                queue = config.get(s, "queue")

                            if groupmatch == 1:
                                if os.path.exists(str(ticketscript)):
                                    p = subprocess.Popen([str(ticketscript), '-customer', str(custid), '-device', 
                                                                str(evt.device), '-deviceIP', str(evt.ipAddress), '-collector', socket.getfqdn(), 
                                                                '-first', str(evt.firstTime), '-last', str(evt.lastTime), '-count', str(evt.count), 
                                                                '-summary', str(evt.summary), '-noteTitle', 'System Monitor Error', '-note', 
                                                                str(evt.message), '-severity', str(evt.severity), '-group', str(evt.DeviceGroups), 
                                                                '-impact', str(evt.DevicePriority), '-component', str(evt.component), 
                                                                '-queue', str(queue)], stdout=subprocess.PIPE)

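                            if p:
                                # The ticket number is read from the script's stdout;
                                # anything that is not a number counts as "no ticket created".
                                try:
                                    ticket = int(p.stdout.read())
                                except ValueError:
                                    ticket = 0
                            else:
                                ticket = 0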

                            if ticket > 0:
                                ticketscreated = ticketscreated + 1
                                eventli = [evt.evid]
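                                try:
                                    # Acknowledge the event (event state 1) now that a ticket exists.
                                    dmd.ZenEventManager.manage_setEventStates(1, eventli)
                                except OperationalError, err:
                                    # Ignore selected MySQL errors such as lock wait timeouts (1205)
                                    # and deadlocks (1213); re-raise anything else.
                                    if err.args[0] not in (1205, 1213, 1422, 1206, 2002):
                                        raise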

                        else:
                            if not re.match('/Status/Ping', evt.eventClass) and not re.match('SNMP agent down', evt.summary) and not evt.severity == 1:
                                try:
                                    dmd.ZenEventManager.manage_deleteEvents(evt.evid)
                                except OperationalError, err:
                                    # Ignore selected MySQL errors such as lock wait timeouts (1205)
                                    # and deadlocks (1213); re-raise anything else.
                                    if err.args[0] not in (1205, 1213, 1422, 1206, 2002):
                                        raise

            if ticketscreated > 0:
                if ticketscreated == 1:
                    logging.info('ticket create script ran %s time', ticketscreated)
                else:
                    logging.info('ticket create script ran %s times', ticketscreated)

            time.sleep(int(cycletime))


if __name__ == "__main__":
    daemon = MyDaemon(pidfile)
    if len(sys.argv) == 2:
        if 'start' == sys.argv[1]:
            if os.path.exists(zenconfpath):
                logging.info('Starting zenticket')
                daemon.start()
            else:
                print '%s is missing, aborting start.' % (zenconfpath)
        elif 'stop' == sys.argv[1]:
            if os.path.exists(pidfile):
                logging.info('Deleting PID file %s ...', pidfile)
                logging.info('zenticket shutting down')
            print 'stopping...'
            daemon.stop()
        elif 'restart' == sys.argv[1]:
            if os.path.exists(pidfile):
                logging.info('Deleting PID file %s ...', pidfile)
                logging.info('zenticket shutting down')
            print 'stopping...'
            daemon.stop()
            if os.path.exists(zenconfpath):
                logging.info('Starting zenticket')
                daemon.start()
            else:
                print '%s is missing, aborting start.' % (zenconfpath)
        elif 'status' == sys.argv[1]:
            try:
                pf = file(daemon.pidfile,'r')
                pid = int(pf.read().strip())
                pf.close()
            except IOError:
                pid = None

            def check_pid(pid):
                """ Check For the existence of a unix pid. """
                try:
                    os.kill(pid, 0)
                except OSError:
                    return False
                else:
                    return True

            if not pid:
                print 'not running'
            else:
                if check_pid(pid) == True:
                    print 'program running; pid=%s' % (pid)
                else:
                    print 'not running'

        else:
            print "usage: zenticket start|stop|restart|status"
            sys.exit(2)
        sys.exit(0)
    else:
        print "usage: zenticket start|stop|restart|status"
        sys.exit(2)
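
The decision about which events need a ticket is easier to see in isolation. The sketch below is a simplified Python 3 restatement of the tracking logic in the main loop, not a drop-in replacement: events are plain (evid, count, state) tuples instead of Zenoss event objects, the severity and group filters are left out, and events_needing_tickets is a name made up for this illustration.

```python
def events_needing_tickets(tracked, current):
    """Update `tracked` (evid -> last seen count) and return evids that need a ticket.

    `current` is a list of (evid, count, event_state) tuples for one cycle.
    """
    current_ids = {evid for evid, _, _ in current}
    # Forget events that have dropped out of the event list.
    for evid in list(tracked):
        if evid not in current_ids:
            del tracked[evid]

    needs_ticket = []
    for evid, count, state in current:
        if evid in tracked:
            # Already seen: ticket again only if the event has recurred.
            if count > tracked[evid]:
                needs_ticket.append(evid)
                tracked[evid] = count
        else:
            # First sighting: ticket only if the event state is 0.
            if state == 0:
                needs_ticket.append(evid)
            tracked[evid] = count
    return needs_ticket


tracked = {}
print(events_needing_tickets(tracked, [("a", 1, 0), ("b", 3, 1)]))  # ['a']
print(events_needing_tickets(tracked, [("a", 2, 1), ("b", 3, 1)]))  # ['a']
print(events_needing_tickets(tracked, [("b", 3, 1)]))               # []
print(tracked)                                                      # {'b': 3}
```

Keeping the last seen count per event is what lets a recurring event produce a new ticket run without producing one on every cycle.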

Here is the daemon script that the zenticket script uses to daemonize itself:

#!/usr/bin/env python

import sys, os, time, atexit, signal
from signal import SIGTERM 

class Daemon:
	"""
	A generic daemon class.
	
	Usage: subclass the Daemon class and override the run() method
	"""
	def __init__(self, pidfile, stdin='/dev/null', stdout='/dev/null', stderr='/dev/null'):
		self.stdin = stdin
		self.stdout = stdout
		self.stderr = stderr
		self.pidfile = pidfile

	def daemonize(self):
		"""
		do the UNIX double-fork magic, see Stevens' "Advanced 
		Programming in the UNIX Environment" for details (ISBN 0201563177)
		http://www.erlenstar.demon.co.uk/unix/faq_2.html#SEC16
		"""
		try: 
			pid = os.fork() 
			if pid > 0:
				# exit first parent
				sys.exit(0) 
		except OSError, e: 
			sys.stderr.write("fork #1 failed: %d (%s)\n" % (e.errno, e.strerror))
			sys.exit(1)
	
		# decouple from parent environment
		os.chdir("/") 
		os.setsid() 
		os.umask(0) 
	
		# do second fork
		try: 
			pid = os.fork() 
			if pid > 0:
				# exit from second parent
				sys.exit(0) 
		except OSError, e: 
			sys.stderr.write("fork #2 failed: %d (%s)\n" % (e.errno, e.strerror))
			sys.exit(1) 
	
		# redirect standard file descriptors
		sys.stdout.flush()
		sys.stderr.flush()
		si = file(self.stdin, 'r')
		so = file(self.stdout, 'a+')
		se = file(self.stderr, 'a+', 0)
		os.dup2(si.fileno(), sys.stdin.fileno())
		os.dup2(so.fileno(), sys.stdout.fileno())
		os.dup2(se.fileno(), sys.stderr.fileno())
	
		# write pidfile
		atexit.register(self.delpid)
		pid = str(os.getpid())
		file(self.pidfile,'w+').write("%s\n" % pid)
	
	def delpid(self):
		os.remove(self.pidfile)

	def start(self):
		"""
		Start the daemon
		"""

		def check_pid(pid):
			""" Check for the existence of a unix pid. """
			try:
				os.kill(pid, 0)
			except OSError:
				return False
			else:
				return True

		# Check for a pidfile to see if the daemon already runs
		try:
			pf = file(self.pidfile,'r')
			pid = int(pf.read().strip())
			pf.close()
		except IOError:
			pid = None

		if pid and check_pid(pid):
			message = "is already running\n"
			sys.stderr.write(message)
			sys.exit(1)

		# Start the daemon; a stale pidfile is overwritten by daemonize()
		message = "starting...\n"
		sys.stderr.write(message)
		self.daemonize()
		self.run()

	def stop(self):
		"""
		Stop the daemon
		"""
		# Get the pid from the pidfile
		try:
			pf = file(self.pidfile,'r')
			pid = int(pf.read().strip())
			pf.close()
		except IOError:
			pid = None
	
		if not pid:
			message = "already stopped\n"
			sys.stderr.write(message)
			return # not an error in a restart

		# Try killing the daemon process	
		try:
			while 1:
				os.kill(pid, SIGTERM)
				time.sleep(0.1)
		except OSError, err:
			err = str(err)
			if err.find("No such process") > 0:
				if os.path.exists(self.pidfile):
					os.remove(self.pidfile)
			else:
				print str(err)
				sys.exit(1)

	def restart(self):
		"""
		Restart the daemon
		"""
		self.stop()
		self.start()

	def run(self):
		"""
		You should override this method when you subclass Daemon. It will be called after the process has been
		daemonized by start() or restart().
		"""