Bug #3589: time-slider crashes

Added by Pavel Cahyna over 9 years ago. Updated over 8 years ago.

Status:
Closed
Priority:
Normal
Assignee:
OI JDS
Category:
Desktop (JDS)
Target version:
-
Start date:
2013-02-22
Due date:
2014-02-06
% Done:

100%

Estimated time:
2.00 h
Difficulty:
Bite-size
Tags:
python

Description

When pool usage exceeds 80%, time-slider crashes:

# fmadm faulty -a
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Feb 20 09:43:59 acee7651-d336-4854-c6e3-9f7d2dcb8cbd  SMF-8000-YX    major     

Host        : nfsserv1
Platform    : X8DAH     Chassis_id  : 1234567890
Product_sn  : 

Fault class : defect.sunos.smf.svc.maintenance
Affects     : svc:///application/time-slider:default
                  ok and in service
Problem in  : svc:///application/time-slider:default
                  repair attempted

Description : A service failed - the instance is restarting too quickly.
              Refer to http://illumos.org/msg/SMF-8000-YX for more information.

Response    : The service has been placed into the maintenance state.

Impact      : svc:/application/time-slider:default is unavailable.

Action      : Run 'svcs -xv svc:/application/time-slider:default' to determine
              the generic reason why the service failed, the location of any
              logfiles, and a list of other services impacted.

The log /var/svc/log/application-time-slider:default.log says:

[ Feb 20 09:43:55 Executing start method ("/lib/svc/method/time-slider start"). ]
Warning level value is:   80%
Critical level value is:  90%
Emergency level value is: 95%
ZPool name: compass
        Health: ONLINE
        Used: 780873Mb
        Available: 154550Mb
        Capacity: 83.4779819766%
ZPool name: rpool
        Health: ONLINE
        Used: 31276Mb
        Available: 248947Mb
        Capacity: 11.1612781482%
Last monthly snapshot was: compass/home/urban@zfs-auto-snap_monthly-2013-02-05-19h38
Recalculating monthly schedule
Last weekly snapshot was: compass/home/cahyna@zfs-auto-snap_weekly-2013-02-12-19h38
Recalculating weekly schedule
Recalculating daily schedule
Recalculating hourly schedule
Last frequent snapshot was: compass/home/cahyna@zfs-auto-snap_frequent-2013-02-19-17h53
Recalculating frequent schedule
Found disabled plugin:  %ssvc:/application/time-slider/plugin:rsync
Found disabled plugin:  %ssvc:/application/time-slider/plugin:zfs-send
[ Feb 20 09:43:57 Method "start" exited with status 0. ]
Auto excluding rpool/dump volume
Auto excluding rpool/swap volume
compass needs a cleanup
Performing warning level cleanup on compass
compass pool status after cleanup:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 525, in __bootstrap_inner
    self.run()
  File "/usr/lib/../share/time-slider/lib/time_slider/timesliderd.py", line 136, in run
    self._perform_cleanup()
  File "/usr/lib/../share/time-slider/lib/time_slider/timesliderd.py", line 676, in _perform_cleanup
    util.debug(zpool, self.verbose)
  File "/usr/lib/../share/time-slider/lib/time_slider/util.py", line 61, in debug
    syslog.syslog(syslog.LOG_NOTICE, message + '\n')
TypeError: unsupported operand type(s) for +: 'instance' and 'str'

Snapshot monitor thread exited.
Snapshot monitor thread exited abnormally
Exit code: 0
[ Feb 20 09:43:59 Stopping because all processes in service exited. ]
[ Feb 20 09:43:59 Executing stop method (:kill). ]
[ Feb 20 09:43:59 Restarting too quickly, changing state to maintenance. ]

Code lines mentioned in the backtrace:
/usr/lib/../share/time-slider/lib/time_slider/timesliderd.py:

    668             # Bad - there's no more snapshots left and nothing 
    669             # left to delete. We don't disable the service since
    670             # it will permit self recovery and snapshot
    671             # retention when space becomes available on
    672             # the pool (hopefully).
    673             util.debug("%s pool status after cleanup:" \
    674                        % zpool.name, \
    675                        self.verbose)
--->676             util.debug(zpool, self.verbose)
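The call on line 676 passes the ZPool object itself rather than a string, so the `message + '\n'` concatenation inside `util.debug()` raises the TypeError. A minimal sketch of the failure mode (`ZPool` here is a hypothetical stand-in for the real class in the time-slider code; under Python 2's old-style classes the error text reads `'instance' and 'str'`, as in the log above):

```python
class ZPool(object):
    """Hypothetical stand-in for time-slider's ZPool class.
    Like the real class, it defines no __add__/__radd__, so
    concatenating it with a str raises TypeError."""
    def __init__(self, name):
        self.name = name

pool = ZPool("compass")
try:
    # This is effectively what util.debug() attempts on line 61
    # when timesliderd.py line 676 hands it the pool object:
    pool + '\n'
except TypeError as e:
    print("TypeError:", e)
```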

/usr/lib/../share/time-slider/lib/time_slider/util.py:

     53 def debug(message, verbose):
     54     """ 
     55     Prints message out to standard error and syslog if
     56     verbose = True.
     57     Note that the caller needs to first establish a syslog
     58     context using syslog.openlog()
     59     """ 
     60     if verbose:
---->61         syslog.syslog(syslog.LOG_NOTICE, message + '\n')
     62         sys.stderr.write(message + '\n')
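Either the call site should stringify the pool (`util.debug(str(zpool), self.verbose)`) or `debug()` should coerce its argument defensively. A sketch of the defensive variant (this is not the actual upstream patch; `FakeZPool` is hypothetical, and the `return` is added here only so the formatted text can be inspected):

```python
import sys

def debug(message, verbose):
    """Sketch of a hardened util.debug(): coerce message to str so
    non-string arguments (such as a ZPool instance) no longer raise
    TypeError on concatenation. Returns the formatted text for
    inspection; the daemon would also pass it to syslog.syslog()."""
    if not verbose:
        return None
    text = str(message) + '\n'
    sys.stderr.write(text)
    return text

class FakeZPool(object):
    """Hypothetical object whose __str__ mimics a pool summary."""
    def __str__(self):
        return "compass  ONLINE  capacity 83.5%"

debug(FakeZPool(), True)  # writes the summary instead of crashing
```

The `str()` coercion is a no-op for the string messages every other caller passes, so it fixes line 676 without changing behavior elsewhere.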

Note that verbose is set:

# svcprop svc:/application/time-slider:default
daemon/verbose boolean true

#1

Updated by Pavel Cahyna over 9 years ago

I tried setting daemon/verbose to false, but it did not solve the problem.

#2

Updated by Ken Mays about 9 years ago

  • Assignee set to OI JDS
#3

Updated by Ken Mays over 8 years ago

  • Due date set to 2014-02-06
  • Category set to Desktop (JDS)
  • Status changed from New to Closed
  • % Done changed from 0 to 100
  • Estimated time set to 2.00 h
  • Tags changed from needs-triage to python

Fixed. Apparently a bug in the Python build caused this issue. I pushed the ZFS pools above 85%-99% usage and time-slider seems stable with current builds against Python 2.6.9/2.7.5. Reopen if this reappears on your machine after updating to the oi_151a9/hipster releases.
