meta_policy_search

Meta-Policy Search

  • Baselines
    • Baseline (Interface)
    • Linear Feature Baseline
    • LinearTimeBaseline
  • Environments
    • MetaEnv (Interface)
  • Meta-Algorithms
    • MAML-Algorithm (Interface)
    • ProMP-Algorithm
    • TRPO-MAML-Algorithm
    • VPG-MAML-Algorithm
  • Optimizers
    • Conjugate Gradient Optimizer
    • MAML First Order Optimizer
  • Policies
    • Policy Interfaces
    • Gaussian-Policies
  • Samplers
    • Sampler
    • Sample Processor
    • Vectorized Environment Executor

Meta-Trainer

class meta_policy_search.meta_trainer.Trainer(algo, env, sampler, sample_processor, policy, n_itr, start_itr=0, num_inner_grad_steps=1, sess=None)

Bases: object

Performs steps of meta-policy search.

Pseudocode:

for itr in range(n_itr):
    sample tasks
    for task in tasks:
        for adapt_step in range(num_inner_grad_steps):
            sample trajectories with policy
            perform update/adaptation step
        sample trajectories with post-update policy
    perform meta-policy gradient step(s)
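Parameters:
  • algo (Algo) – meta-learning algorithm used to update the policy
  • env (Env) – environment on which the policy is trained
  • sampler (Sampler) – sampler that collects trajectories with the policy
  • sample_processor (SampleProcessor) – processor applied to the sampled trajectories
  • baseline (Baseline) – listed here, but absent from the constructor signature above
  • policy (Policy) – policy to be trained
  • n_itr (int) – number of iterations to train for
  • start_itr (int) – number of iterations the policy has already been trained for, if reloading
  • num_inner_grad_steps (int) – number of inner gradient (adaptation) steps per MAML iteration
  • sess (tf.Session) – current TensorFlow session (for example, if a policy was loaded)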
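The class-level pseudocode above implies that each task is sampled num_inner_grad_steps + 1 times per meta-iteration: once before every adaptation step and once more with the post-update policy. The following is only a minimal sketch of that ordering. The function name meta_iteration_schedule and the step labels are hypothetical and are not part of meta_policy_search.

```python
def meta_iteration_schedule(tasks, num_inner_grad_steps):
    """Yield (task, step) pairs in the order given by the class pseudocode.

    Hypothetical helper for illustration only; it performs no sampling or
    optimization, it just enumerates the steps of one meta-iteration.
    """
    for task in tasks:
        for adapt_step in range(num_inner_grad_steps):
            # Sample with the current policy, then adapt it to the task.
            yield (task, f"sample[{adapt_step}]")
            yield (task, f"adapt[{adapt_step}]")
        # One extra sampling round with the adapted (post-update) policy.
        yield (task, "sample[post-update]")
    # A single meta-policy gradient step closes the meta-iteration.
    yield (None, "meta-gradient step")


steps = list(meta_iteration_schedule(["task_a", "task_b"], num_inner_grad_steps=1))
for task, step in steps:
    print(task, step)
```

With two tasks and one inner step, this enumerates three steps per task plus the final meta-gradient step.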
get_itr_snapshot(itr)

Gets the current policy and env so that they can be stored as a snapshot of iteration itr.
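As a rough illustration of what storing such a snapshot could look like, the sketch below bundles an iteration counter, a policy, and an env into a dict and round-trips it through pickle. The dict layout, the key names, and the function get_itr_snapshot_sketch are assumptions; the documentation does not state the actual return format, and the policy and env here are placeholder values.

```python
import pickle


def get_itr_snapshot_sketch(itr, policy, env):
    """Bundle the objects to store for iteration `itr`.

    Assumed layout: a plain dict. The real key names are not documented.
    """
    return {"itr": itr, "policy": policy, "env": env}


# Placeholder objects standing in for a real policy and environment.
snapshot = get_itr_snapshot_sketch(3, policy={"weights": [0.1, 0.2]}, env="toy-env")

# Serialize and restore in memory to mimic saving and reloading a snapshot.
restored = pickle.loads(pickle.dumps(snapshot))
print(restored["itr"])
```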

train()

Trains policy on env using algo

Pseudocode:

for itr in range(n_itr):
    for step in range(num_inner_grad_steps):
        sampler.sample()
        algo.compute_updated_dists()
    algo.optimize_policy()
    sampler.update_goals()
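To make the call order in this pseudocode concrete, the sketch below runs the same loop against recording stand-ins instead of real components. RecordingStub and train_sketch are hypothetical names introduced here. The method names are taken from the pseudocode only and may not match the library's actual sampler and algorithm interfaces.

```python
class RecordingStub:
    """Hypothetical stand-in that logs every method call instead of doing work."""

    def __init__(self, name, log):
        self._name = name
        self._log = log

    def __getattr__(self, method):
        # Only reached for attributes that are not set in __init__.
        def call(*args, **kwargs):
            self._log.append(f"{self._name}.{method}")
        return call


def train_sketch(algo, sampler, n_itr, num_inner_grad_steps):
    """Mirror the control flow of the train() pseudocode."""
    for itr in range(n_itr):
        for step in range(num_inner_grad_steps):
            sampler.sample()
            algo.compute_updated_dists()
        algo.optimize_policy()
        sampler.update_goals()


log = []
train_sketch(RecordingStub("algo", log), RecordingStub("sampler", log),
             n_itr=2, num_inner_grad_steps=1)
print(log)
```

Each outer iteration performs the inner sample and update pairs first, then a single optimize_policy call, and only afterwards moves the sampler on to new goals.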

© Copyright 2018, Dennis Lee, Ignasi Clavera, Jonas Rothfuss Revision 93ae339e.