This repository was archived by the owner on Dec 11, 2022. It is now read-only.

TimeDistributed LSTM Middleware #461

@OGordon100

Description

In many real-world situations the task has hidden state or partially observable features, so the Markov assumption only partially holds.

One way around this is frame stacking, which is already possible in Coach with filters.observation.observation_stacking_filter (a rough sketch is below). It may be even better to use an LSTM (or a bidirectional LSTM). Agents for this already exist, the widely cited DRQN being one of them.
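
For reference, a minimal sketch of wiring in the stacking filter. The import paths, the InputFilter.add_observation_filter call, and the ObservationStackingFilter arguments are my assumptions about the Coach API based on the docs/presets, not a verified recipe:

```python
# Sketch only: import paths and signatures are assumed, not verified.
from rl_coach.filters.filter import InputFilter
from rl_coach.filters.observation.observation_stacking_filter import ObservationStackingFilter

# Stack the last 4 observations so the agent sees a short history window
# instead of a single, possibly ambiguous, frame.
input_filter = InputFilter()
input_filter.add_observation_filter('observation', 'stacking',
                                    ObservationStackingFilter(stack_size=4))
```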

Coach currently provides the LSTMMiddleware layer. However, from what I understand of the source code, it runs the LSTM along the observation axis (for inputs such as text). TensorFlow, of course, has the TimeDistributed wrapper (together with return_sequences=True on the LSTM) to run an LSTM along the temporal axis, i.e. across transitions; see the sketch below.
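
To make the requested behaviour concrete, here is a minimal plain-Keras sketch (not Coach middleware code; the layer sizes and the DRQN-style convolutional encoder are illustrative assumptions) of an LSTM running along the temporal axis, with a per-frame encoder wrapped in TimeDistributed:

```python
import tensorflow as tf

def build_drqn_style_network(time_steps, obs_shape, num_actions):
    # Input is a sequence of observations: (batch, time, H, W, C)
    obs_seq = tf.keras.Input(shape=(time_steps,) + obs_shape)

    # Per-frame encoder, applied identically to every time step via TimeDistributed
    frame_encoder = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 8, strides=4, activation='relu'),
        tf.keras.layers.Conv2D(64, 4, strides=2, activation='relu'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation='relu'),
    ])
    x = tf.keras.layers.TimeDistributed(frame_encoder)(obs_seq)

    # The LSTM runs across time steps (i.e. across transitions),
    # not across the feature/observation axis.
    x = tf.keras.layers.LSTM(256, return_sequences=True)(x)

    # One output vector per time step (Q-values, action logits, etc.)
    outputs = tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(num_actions))(x)
    return tf.keras.Model(obs_seq, outputs)

# Example: sequences of 8 greyscale 84x84 observations, 4 actions
model = build_drqn_style_network(8, (84, 84, 1), 4)
```

A middleware version would essentially keep only the TimeDistributed/LSTM part, sitting between Coach's input embedders and output heads.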

Could a time-distributed LSTM be added as a middleware? (Or at the very least "hacked" in; it would be of immense benefit to my current research, in which I am using a simple behavioural cloning agent.)
