Bandit Learning for Sequential Decision Making: A Practical Way to Address the Trade-Off Between Exploration and Exploitation