Analysis of Speeches in Indian Parliamentary Debates
This work addresses the need for structured linguistic analysis of parliamentary data, specifically for Indian debates, but is incremental as it builds on existing work in stance classification and pragmatics.
The authors tackle the problem of unstructured parliamentary debate data by creating a dataset of Indian parliamentary debates and performing stance classification to identify support for or opposition to bills. They also report promising preliminary results on automatically classifying the purpose of speeches into four categories.
With the increasing use of the internet, more and more data is being digitized, including parliamentary debates, but these debates are in an unstructured format. There is a need to convert them into a structured format for linguistic analysis. Much work has been done on various aspects of parliamentary data, such as Hansard and American congressional floor-debate data, but less on pragmatics. In this paper, we provide a dataset of the synopses of Indian parliamentary debates and perform stance classification of speeches, i.e., identifying whether the speaker supports or opposes the bill or issue. We also analyze the intention of the speeches beyond mere sentences, i.e., pragmatics in the parliament. Based on thorough manual analysis of the debates, we developed an annotation scheme of four mutually exclusive categories to analyze the purpose of the speeches: to find out ISSUES, to BLAME, to APPRECIATE, and to make a CALL FOR ACTION. We annotated the provided dataset with these four categories and conducted preliminary experiments on automatic detection of the categories. Our automated classification approach gave promising results.
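The annotation scheme described above can be pictured as a small data model. The following is only a minimal sketch under stated assumptions: the class names, field names, and the `purpose_distribution` helper are hypothetical and do not come from the dataset release, and the abstract does not state whether labels attach to whole speeches or to smaller spans, so the sketch assumes one stance and one purpose label per record.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Stance(Enum):
    """Whether the speaker supports or opposes the bill/issue."""
    FOR = "for"
    AGAINST = "against"


class Purpose(Enum):
    """The four mutually exclusive purpose categories named in the abstract."""
    ISSUES = "issues"
    BLAME = "blame"
    APPRECIATE = "appreciate"
    CALL_FOR_ACTION = "call_for_action"


@dataclass(frozen=True)
class AnnotatedSpeech:
    # All field names are hypothetical; the real dataset schema may differ.
    speaker: str
    text: str
    stance: Stance
    # A single-valued field enforces mutual exclusivity: one label per record.
    purpose: Purpose


def purpose_distribution(speeches):
    """Count how many records carry each purpose label."""
    return Counter(s.purpose for s in speeches)
```

Storing `purpose` as a single enum value rather than a set of flags mirrors the requirement that the categories be mutually exclusive. A helper such as `purpose_distribution` is the kind of check one might run before training a classifier, since a heavily skewed label distribution affects how results should be read.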