On this page
tf.compat.v1.train.MonitoredSession
Session-like object that handles initialization, recovery and hooks.
tf.compat.v1.train.MonitoredSession(
    session_creator=None, hooks=None, stop_grace_period_secs=120
)
  Example usage:
saver_hook = CheckpointSaverHook(...)
summary_hook = SummarySaverHook(...)
with MonitoredSession(session_creator=ChiefSessionCreator(...),
                      hooks=[saver_hook, summary_hook]) as sess:
  while not sess.should_stop():
    sess.run(train_op)
  Initialization: At creation time the monitored session does following things in given order:
- calls 
hook.begin()for each given hook - finalizes the graph via 
scaffold.finalize() - create session
 - initializes the model via initialization ops provided by 
Scaffold - restores variables if a checkpoint exists
 - launches queue runners
 - calls 
hook.after_create_session() 
Run: When run() is called, the monitored session does following things:
- calls 
hook.before_run() - calls TensorFlow 
session.run()with merged fetches and feed_dict - calls 
hook.after_run() - returns result of 
session.run()asked by user - if 
AbortedErrororUnavailableErroroccurs, it recovers or reinitializes the session before executing the run() call again 
Exit: At the close(), the monitored session does following things in order:
- calls 
hook.end() - closes the queue runners and the session
 - suppresses 
OutOfRangeerror which indicates that all inputs have been processed if the monitored_session is used as a context 
How to set tf.compat.v1.Session arguments:
- In most cases you can set session arguments as follows:
 
MonitoredSession(
  session_creator=ChiefSessionCreator(master=..., config=...))
  - In distributed setting for a non-chief worker, you can use following:
 
MonitoredSession(
  session_creator=WorkerSessionCreator(master=..., config=...))
  See MonitoredTrainingSession for an example usage based on chief or worker.
Note: This is not a tf.compat.v1.Session. For example, it cannot do following:
  
  - it cannot be set as default session.
 - it cannot be sent to saver.save.
 - it cannot be sent to tf.train.start_queue_runners.
 
| Args | |
|---|---|
session_creator | 
      A factory object to create session. Typically a ChiefSessionCreator which is the default one. | 
     
hooks | 
      An iterable of `SessionRunHook' objects. | 
| Returns | |
|---|---|
| A MonitoredSession object. | 
| Args | |
|---|---|
session_creator | 
      A factory object to create session. Typically a ChiefSessionCreator or a WorkerSessionCreator. | 
     
hooks | 
      An iterable of SessionRunHook' objects. </td> </tr><tr> <td>should_recover</td> <td> A bool. Indicates whether to recover fromAbortedErrorandUnavailableErroror not. </td> </tr><tr> <td>stop_grace_period_secs</td> <td> Number of seconds given to threads to stop afterclose()` has been called. | 
     
| Attributes | |
|---|---|
graph | 
      The graph that was launched in this session. | 
Child Classes
Methods
close
  
  close()
  run
  
  run(
    fetches, feed_dict=None, options=None, run_metadata=None
)
  Run ops in the monitored session.
This method is completely compatible with the tf.Session.run() method.
| Args | |
|---|---|
fetches | 
      Same as tf.Session.run(). | 
     
feed_dict | 
      Same as tf.Session.run(). | 
     
options | 
      Same as tf.Session.run(). | 
     
run_metadata | 
      Same as tf.Session.run(). | 
     
| Returns | |
|---|---|
Same as tf.Session.run(). | 
     
run_step_fn
  
  run_step_fn(
    step_fn
)
  Run ops using a step function.
| Args | |
|---|---|
step_fn | 
      A function or a method with a single argument of type StepContext. The function may use methods of the argument to perform computations with access to a raw session. The returned value of the step_fn will be returned from run_step_fn, unless a stop is requested. In that case, the next should_stop call will return True. Example usage:  Hooks interact with the   | 
     
| Returns | |
|---|---|
Returns the returned value of step_fn. | 
     
| Raises | |
|---|---|
StopIteration | 
      if step_fn has called request_stop(). It may be caught by with tf.MonitoredSession() to close the session. | 
     
ValueError | 
      if step_fn doesn't have a single argument called step_context. It may also optionally have self for cases when it belongs to an object. | 
     
should_stop
  
  should_stop()
  __enter__
  
  __enter__()
  __exit__
  
  __exit__(
    exception_type, exception_value, traceback
)
 © 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
 https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/compat/v1/train/MonitoredSession