Toxic Comments Hunter: Score Severity of Toxic Comments
This work addresses online toxicity, a real problem for internet users, but its contribution is incremental because it applies existing methods to new data.
The paper tackles the problem of detecting toxic comments by collecting datasets, cleaning the data and extracting features, and training TF-IDF-based models alongside a fine-tuned BERT model. The result is software that scores toxic comments in real time.
Detecting and identifying toxic comments helps create a civilized and harmonious Internet environment. In this experiment, we collected several datasets related to toxic comments. Because of the characteristics of comment data, we applied data cleaning and feature extraction from different angles to obtain different toxic comment training sets. For model construction, we used these training sets to train TF-IDF-based models and, separately, to fine-tune a BERT model. Finally, we packaged the code into software that scores toxic comments in real time.
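The cleaning and TF-IDF feature extraction steps described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the specific cleaning rules (lowercasing, stripping URLs and HTML tags, squeezing repeated characters), the helper names `clean_comment` and `tfidf_vectors`, and the smoothed IDF formula are all assumptions chosen for illustration. A regressor or classifier trained on severity labels would consume the resulting vectors; that step is omitted here.

```python
import math
import re
from collections import Counter


def clean_comment(text: str) -> str:
    """Normalize a raw comment (hypothetical cleaning rules, not the paper's)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)    # drop URLs
    text = re.sub(r"<[^>]+>", " ", text)         # drop HTML tags
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)   # squeeze runs: "soooo" -> "soo"
    text = re.sub(r"[^a-z0-9\s']", " ", text)    # keep letters, digits, apostrophes
    return re.sub(r"\s+", " ", text).strip()     # collapse whitespace


def tfidf_vectors(docs):
    """Return (vectors, idf): one L2-normalized sparse TF-IDF dict per document."""
    tokenized = [clean_comment(d).split() for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed IDF, ln((1 + n) / (1 + df)) + 1, so no term gets zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {t: c * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors, idf


if __name__ == "__main__":
    comments = ["You are great", "You are an IDIOT!!!"]
    vecs, idf = tfidf_vectors(comments)
    for comment, vec in zip(comments, vecs):
        print(clean_comment(comment), vec)
```

Terms that appear in fewer comments (here "idiot") receive a higher IDF than terms shared by every comment (here "you"), which is what lets a downstream model pick up on distinctive toxic vocabulary. In practice a library implementation such as scikit-learn's `TfidfVectorizer` would likely replace the hand-written functions.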