IR · Oct 18, 2014

Penerapan teknik web scraping pada mesin pencari artikel ilmiah (Application of web scraping techniques to scientific article search engines)

Ahmad Josi, Leon Andretti Abdillah, Suryayusra

arXiv:1410.5777v1 · 46 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for automated data retrieval from scientific databases, but it appears incremental as it applies existing web scraping methods to specific search engines without introducing new paradigms.

The paper tackles the problem of extracting information from scientific article search engines by applying web scraping techniques, resulting in a method that mimics web applications to collect data from free sources like Portal Garuda, ISJD, and Google Scholar.

A search engine is a combination of computer hardware and software that a company provides through a designated website. Search engines collect information from the web using bots, or web crawlers, that crawl the web periodically. Retrieving information from existing websites in this way is called "web scraping": a technique for extracting information from websites, closely related to web indexing. To develop a web scraping technique, the programmer first studies the HTML document of the target website to identify the HTML tags that enclose the desired information. The programmer then learns how navigation works on that website, so that the scraping application can mimic it when collecting the information. The implementation described here scrapes only free search engines: Portal Garuda, the Indonesian Scientific Journal Database (ISJD), and Google Scholar.
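The tag-based extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the markup in `SAMPLE_HTML`, the `class="title"` convention, and the names `TitleLinkParser` and `scrape_titles` are all hypothetical, chosen only to show how information enclosed by known HTML tags can be pulled out once the page structure has been studied. It uses Python's standard `html.parser` module and an inline string, so no network access is involved.

```python
from html.parser import HTMLParser

# Hypothetical search-result markup (an assumption for illustration).
# Real pages from Portal Garuda, ISJD, or Google Scholar use their own
# tags and class names, which must be studied before writing a scraper.
SAMPLE_HTML = """
<html><body>
  <div class="result"><a class="title" href="/article/1">Web Scraping Basics</a></div>
  <div class="result"><a class="title" href="/article/2">Indexing Scientific Journals</a></div>
  <a href="/about">About</a>
</body></html>
"""


class TitleLinkParser(HTMLParser):
    """Collects (title, href) pairs from <a class="title"> elements."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._href = None   # href of the title link currently open, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Only links marked as result titles are of interest.
        if tag == "a" and attr_map.get("class") == "title":
            self._href = attr_map.get("href")
            self._text = []

    def handle_data(self, data):
        # Accumulate text only while inside a title link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.results.append(("".join(self._text).strip(), self._href))
            self._href = None


def scrape_titles(html):
    """Return a list of (title, href) tuples found in the given HTML text."""
    parser = TitleLinkParser()
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    for title, href in scrape_titles(SAMPLE_HTML):
        print(title, href)
```

In a real scraper the HTML would come from an HTTP response rather than a constant, and the tag and attribute checks would be adapted to each search engine's actual markup, subject to that site's terms of use.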
