Pre-processing of Domain Ontology Graph Generation System in Punjabi
This work addresses the challenge of processing Punjabi text for ontology generation. Its contribution is incremental: it applies standard pre-processing techniques to a specific language.
The paper tackles the problem of generating ontology graphs from Punjabi text documents by focusing on the pre-processing phase. This phase structures the input text by removing special symbols, duplicate terms, and stop words, and by extracting terms through matching against dictionary and gazetteer lists.
This paper describes the pre-processing phase of an ontology graph generation system for Punjabi text documents from different domains. Pre-processing produces a structured representation of the input text. It consists of applying input restrictions to the text, removing special symbols and punctuation marks, removing duplicate terms, removing stop words, and extracting terms by matching input terms against the terms in dictionary and gazetteer lists.
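
The pre-processing steps listed above can be illustrated with a minimal Python sketch. It is not the system's actual implementation. The stop-word, dictionary, and gazetteer lists are hypothetical and deliberately tiny, the function name `preprocess` is invented for illustration, and the "input restriction" is assumed here to mean accepting only text that contains Gurmukhi script, which is one possible reading of the description.

```python
import re

# Hypothetical, deliberately tiny lists for illustration only;
# the lists used by the actual system are not given in the text.
STOP_WORDS = {"ਦਾ", "ਦੇ", "ਦੀ", "ਹੈ", "ਅਤੇ", "ਨੂੰ", "ਵਿੱਚ"}
DICTIONARY = {"ਕਿਸਾਨ", "ਕਣਕ"}   # domain terms: farmer, wheat
GAZETTEER = {"ਪੰਜਾਬ"}           # named entity: Punjab

# Runs of characters from the Unicode Gurmukhi block (U+0A00 to U+0A7F).
GURMUKHI_RUN = re.compile(r"[\u0A00-\u0A7F]+")


def preprocess(text):
    """Return candidate terms from Punjabi text, in first-seen order."""
    # Remove special symbols and punctuation by keeping only Gurmukhi runs.
    tokens = GURMUKHI_RUN.findall(text)
    # Input restriction (assumed reading): reject input with no Gurmukhi text.
    if not tokens:
        raise ValueError("input contains no Gurmukhi text")
    # Remove duplicate terms while preserving their original order.
    unique = list(dict.fromkeys(tokens))
    # Remove stop words.
    content = [t for t in unique if t not in STOP_WORDS]
    # Extract only the terms that match the dictionary or gazetteer lists.
    known = DICTIONARY | GAZETTEER
    return [t for t in content if t in known]


sample = "ਪੰਜਾਬ ਦਾ ਕਿਸਾਨ, ਖੇਤ ਵਿੱਚ ਕਣਕ ਅਤੇ ਕਣਕ!"
terms = preprocess(sample)
print(terms)
```

In this sketch the comma and exclamation mark are discarded by the script filter, the repeated word is kept once, the stop words are dropped, and the word that appears in neither list is filtered out in the final matching step. A real system would likely also need Unicode normalization and much larger lists, neither of which is shown here.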