MCMM-LSW: MULTILEVEL CONTENT MINING MODEL FOR LARGE SCALE WEBSITES
Conference: Fifth International Conference on Advances in Information Technology and Mobile Communication
Abstract: With the current growth of the WWW, the data available on websites is increasing at a large scale, so efficient web data extraction has become a great challenge for large-scale websites. The main requirement of such websites is to extract accurate data efficiently within a reasonable amount of time. This paper proposes a web content extraction model for extracting content from large-scale websites. The proposed model (MCMM-LSW) produces a link tree of the website and extracts content based on seed pages extracted from different levels of the link tree. The results show higher recall, precision, and overall accuracy (F-measure) than the techniques reported in the literature. The effect of changing the number of levels of the website is also shown in the results. Finally, a comparison of keyword-based search and the proposed approach is presented.
AIM - 2015
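
The abstract describes a link tree whose levels supply seed pages, and it reports results as precision, recall, and F-measure. As a rough illustration only, the sketch below groups pages by their breadth-first depth from a root page and computes the standard F-measure. The in-memory `SITE` link map, the function names, and the choice of breadth-first traversal are assumptions made for this example; the abstract does not specify how MCMM-LSW builds its tree or selects seed pages.

```python
from collections import deque


def build_link_levels(links, root, max_level):
    """Group pages by breadth-first depth from the root (hypothetical helper).

    links: dict mapping a page to the list of pages it links to.
    Returns a dict {level: [pages first reached at that level]}.
    """
    levels = {0: [root]}
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_level:
            continue  # do not expand pages at the deepest requested level
        for target in links.get(page, []):
            if target not in seen:  # each page is placed at its shallowest level
                seen.add(target)
                levels.setdefault(depth + 1, []).append(target)
                queue.append((target, depth + 1))
    return levels


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy link map standing in for a crawled site (assumed data, not from the paper).
SITE = {
    "/": ["/news", "/about"],
    "/news": ["/news/1", "/news/2", "/"],
    "/about": ["/news"],
    "/news/1": [],
    "/news/2": ["/news/1"],
}

levels = build_link_levels(SITE, "/", max_level=2)
for level in sorted(levels):
    print(level, levels[level])
```

Varying `max_level` in this sketch corresponds loosely to the abstract's experiment on changing the number of website levels: a smaller value yields fewer candidate pages from which seed pages could be drawn.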