In this article we analyze the NFL play-by-play dataset. The data contains every play from all games from 2002 through 2013. At roughly 600k rows, it hardly qualifies as big data; the main point of this article is to illustrate the use of Cloudera Impala for big data analysis. We will also compare Impala's performance against Hive.
See the complete analysis here: http://www.infocaptor.com/dashboard/nfl-play-by-play-analysis-using-cloudera-impala