Human Interaction Recognition With Audio And Visual Cues