# Reinforcement learning setup