Exploring reinforcement learning for chatbot training