RLLib (C++ Template Library to Predict, Control, Learn Behaviors, and Represent Learnable Knowledge using On/Off Policy Reinforcement Learning)
RLLib is a lightweight C++ template library that implements incremental, standard, and gradient temporal-difference learning algorithms in Reinforcement Learning. It is optimized for robotic applications and embedded devices that operate under fast duty cycles (e.g., < 30 ms). RLLib has been tested and evaluated on RoboCup 3D soccer simulation agents, physical NAO V4 humanoid robots, and Tiva C Series LaunchPad microcontrollers to predict, control, learn behaviors, and represent learnable knowledge. The implementation of RLLib is inspired by the RLPark API, a library of temporal-difference learning algorithms written in Java.
- Off-policy prediction algorithms: GTD(λ), GTD(λ)True, and GQ(λ).
- Off-policy control algorithms: Q(λ), Greedy-GQ(λ), Softmax-GQ(λ), and Off-PAC (can be used in an on-policy setting).
- On-policy algorithms: TD(λ), TD(λ)AlphaBound, TD(λ)True, Sarsa(λ), Sarsa(λ)AlphaBound, Sarsa(λ)True, Sarsa(λ)Expected, and Actor-Critic (continuous actions, discrete actions, discounted reward setting, averaged reward settings, and so on).
- Supervised learning algorithms: Adaline, IDBD, K1, SemiLinearIDBD, and Autostep.
- Policies: Random, RandomX%Bias, Greedy, Epsilon-greedy, Boltzmann, Normal, and Softmax.
- Dot product: an efficient implementation of the dot product for tile coding based feature representations (with culling traces); a generic illustration of the idea appears after this list.
- Benchmarking environments: Mountain Car, Mountain Car 3D, Swinging Pendulum, Continuous Grid World, Bicycle, Cart Pole, Acrobot, Non-Markov Pole Balancing, and Helicopter.
- Optimization: optimized for very fast duty cycles (e.g., with culling traces, RLLib has been tested on the RoboCup 3D simulator agent and on the NAO V4 cognition thread).
- Usage: the algorithm usage is very similar to RLPark, resulting in a swift learning curve.
- Examples: there are a plethora of examples demonstrating on-policy and off-policy control experiments.
- Visualization: we provide a Qt4 based application to visualize benchmark problems.
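To make the dot product point above concrete, the sketch below shows the general idea only, not RLLib's actual code: with tile coding, the feature vector is binary and sparse, so the dot product with the weight vector reduces to summing the weights at the active tile indices. The function name and types here are illustrative.

#include <cstddef>
#include <vector>

// Illustrative only (not RLLib's implementation): a tile-coded feature vector
// has a handful of active (value 1) entries, so w · phi(s) is just a sum over
// the active indices, i.e., O(#active tiles) rather than O(#features).
double sparseDotProduct(const std::vector<double>& weights,
                        const std::vector<std::size_t>& activeTiles)
{
  double sum = 0.0;
  for (std::size_t index : activeTiles)
    sum += weights[index];
  return sum;
}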
 
OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. We have developed a bridge between Gym and RLLib so that all the functionality provided by Gym can be used while the agents (on/off-policy) are written in RLLib. The openai_gym directory contains our bridge as well as RLLib agents that learn and control the classic control environments.
- Extension for Tiva C Series EK-TM4C123GXL LaunchPad and Tiva C Series TM4C129 Connected LaunchPad microcontrollers.
- Tiva C Series LaunchPad microcontrollers: https://github.com/samindaa/csc688
 
RLLib is a C++ template library. The header files are located in the include directory. You can simply add this directory to your project's include path (e.g., -I./include) to access the algorithms.
To access the control algorithms:
#include "ControlAlgorithm.h"
To access the prediction algorithms:
#include "PredictorAlgorithm.h"
To access the supervised learning algorithms:
#include "SupervisedAlgorithm.h"
RLLib uses the namespace:
using namespace RLLib;
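As a rough sketch of how the pieces fit together, the following wires up an on-policy Sarsa(λ) agent on the Mountain Car benchmark. The class names follow the tests shipped with the library, but the header locations (e.g., RL.h, MountainCar.h), constructor signatures, and argument order shown here are assumptions that may differ between versions; consult the headers in include/ and the programs in test/ for the authoritative API.

#include "ControlAlgorithm.h"   // control algorithms (SarsaControl, etc.)
#include "RL.h"                 // agent/simulator scaffolding (location may vary by version)
#include "MountainCar.h"        // benchmark problem bundled with the test suite

using namespace RLLib;

int main()
{
  // Environment and tile-coding projector (memory size, resolution, and the
  // number of tilings below are example values, not recommendations).
  Random<double>* random = new Random<double>;
  RLProblem<double>* problem = new MountainCar<double>(random);
  Hashing<double>* hashing = new MurmurHashing<double>(random, 1000000);
  Projector<double>* projector = new TileCoderHashing<double>(hashing, problem->dimension(), 10, 10, true);
  StateToStateAction<double>* toStateAction = new StateActionTilings<double>(projector, problem->getDiscreteActions());

  // Sarsa(λ) learner with an accumulating trace and an epsilon-greedy behavior policy.
  Trace<double>* e = new ATrace<double>(projector->dimension());
  double alpha = 0.15 / projector->vectorNorm();
  double gamma = 0.99;
  double lambda = 0.3;
  Sarsa<double>* sarsa = new Sarsa<double>(alpha, gamma, lambda, e);
  double epsilon = 0.01;
  Policy<double>* acting = new EpsilonGreedy<double>(random, problem->getDiscreteActions(), sarsa, epsilon);

  // Wrap the learner in an agent and run a few episodes.
  OnPolicyControlLearner<double>* control = new SarsaControl<double>(acting, toStateAction, sarsa);
  RLAgent<double>* agent = new LearnerAgent<double>(control);
  Simulator<double>* sim = new Simulator<double>(agent, problem, 5000, 300, 1);
  sim->run();

  // Cleanup of the allocated objects is omitted for brevity.
  return 0;
}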
RLLib provides a flexible testing framework. Follow these steps to quickly write a test case.
- To access the testing framework: 
#include "HeaderTest.h" 
#include "HeaderTest.h"
RLLIB_TEST(YourTest)
class YourTest Test: public YourTestBase
{
  public:
    YourTestTest() {}
    virtual ~Test() {}
    void run();
  private:
    void testYourMethod();
};
void YourTestBase::testYourMethod() {/** Your test code */}
void YourTestBase::run() { testYourMethod(); }- Add 
YourTestto thetest/test.cfgfile. - You can use 
@YourTestto execute onlyYourTest. For example, if you need to execute only MountainCar test cases, use @MountainCarTest. 
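For instance, a concrete test following this skeleton might look like the sketch below; the test name is hypothetical, and a plain C++ assert stands in for whatever assertion helpers the framework provides.

#include <cassert>
#include "HeaderTest.h"

RLLIB_TEST(MyFeature)

class MyFeatureTest: public MyFeatureBase
{
  public:
    MyFeatureTest() {}
    virtual ~MyFeatureTest() {}
    void run();
  private:
    void testSomething();
};

void MyFeatureTest::testSomething()
{
  // Replace with checks that exercise RLLib components.
  assert(1 + 1 == 2);
}

void MyFeatureTest::run() { testSomething(); }

The new test is then registered in test/test.cfg as described above.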
We use CMake >= 2.8.7 to build and run the test suite.
- mkdir build
 - cd build; cmake ..
 - make -j
 
RLLib provides a QT5 based visualization tool for Reinforcement Learning problems and algorithms, named RLLibViz. Currently RLLibViz visualizes the following problems and algorithms:
- On-policy:
  - SwingPendulum problem with continuous actions. We use the AverageRewardActorCritic algorithm.
- Off-policy:
  - ContinuousGridworld and MountainCar problems with discrete actions. We use the Off-PAC algorithm.
- In order to run the visualization tool, you need to have QT4.8 installed on your system.
- In order to install RLLibViz:
  - Change directory to visualization/RLLibViz
  - qmake RLLibViz.pro
  - make -j
  - ./RLLibViz
 
 
- Ubuntu >= 11.04
 - Windows (Visual Studio 2013)
 - Mac OS X
 
- Variable action per state.
 - Non-linear algorithms.
 - Deep learning algorithms.
 
- Dynamic Role Assignment using General Value Functions
 - Humanoid Robots and Spoken Dialog Systems for Brief Health Interventions
 
Saminda Abeyruwan, PhD ([email protected], [email protected])

