Text this: Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network /