Textual analysis of End User License Agreement for red-flagging potentially malicious software
This work addresses a security problem for end users by automating the detection of potentially harmful software through EULA analysis, though the contribution is incremental because it applies existing text analysis methods to a new domain.
The paper tackles the problem that end users do not read End User License Agreements (EULAs) because of their length and complexity, which allows malicious terms to go unnoticed. It proposes a solution that summarizes EULAs and classifies them as 'Benign' or 'Malicious' using ensemble learning with supervised classifiers and text summarization methods, achieving an accuracy of 95.8%.
End users download new software and updates every day. Each downloaded program has an associated End User License Agreement (EULA), but it is rarely read. An EULA includes information intended to avoid legal repercussions. However, this poses a host of potential problems, such as spyware or an unwanted effect on the target system. End users do not read these EULAs because the documents are long and extremely difficult to understand. Text summarization is one relevant solution to this kind of problem. This requires a solution that can summarize the EULA and classify it as "Benign" or "Malicious", and we propose such a solution. We extract the EULA text of different software products and then classify the text using eight different supervised classifiers. We use ensemble learning to classify the EULA as benign or malicious using five different text summarization methods. An accuracy of 95.8% shows the effectiveness of the presented approach.
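The abstract does not state how the ensemble combines its individual predictions. As a minimal sketch, assuming a simple majority vote over the labels obtained from each of the five summaries, the final decision could look like the following. The function name `majority_vote`, the conservative tie-breaking rule, and the example votes are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-summary labels into one final label by majority vote.

    Assumption: a tie is resolved conservatively as "Malicious",
    since a missed malicious EULA is the costlier mistake.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    # All labels that share the highest vote count
    winners = [label for label, n in counts.items() if n == top]
    return "Malicious" if "Malicious" in winners else winners[0]


# Hypothetical labels, one per summarization method (five in total)
votes = ["Benign", "Malicious", "Malicious", "Benign", "Malicious"]
print(majority_vote(votes))  # prints "Malicious" (3 votes to 2)
```

Other combination rules, such as weighting each vote by the classifier's validation accuracy, would fit the same interface. Majority voting is used here only because it is the simplest rule to illustrate.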