Optimizing Hive Performance on Analyzing Log Data with Hadoop

L. K. Vishwamitra, Vijay Singh Pawar


Web servers automatically create and maintain log files that record user behaviour, so analyzing these files can yield valuable information. Log files collect a variety of data about the requests made to a web server and act as a visitor sign-in sheet. They can show which pages receive the most and the least traffic, which sites refer visitors to the site, and which pages visitors view. Heavy use of the web generates a huge volume of log files, and processing that volume with an RDBMS is problematic. In this paper we propose using Hadoop, an open-source solution for handling big data. For analysis we use Hive, a big data analytics tool that runs on top of Hadoop and supports an SQL-like query language called HiveQL. This paper also addresses improving the performance of Hive queries on log data.

L.K. Vishwamitra, Vijay Singh Pawar. Optimizing Hive Performance on Analyzing Log Data with Hadoop. Journal of Communication Engineering & Systems. 2018; 8(3): 45–52p.


Hadoop, web mining, log data, big data, pattern analysis

