Neha Bansal
School of Computer Science and Engineering, Geeta University, Haryana, India
Publications
Research Article
Automating Multilingual SDG Event Extraction from Regional Portals Using Web Scraping and LangChain Frameworks
Author(s): Bhawna Singla* and Neha Bansal
This research presents a novel, scalable, and multilingual data extraction framework designed to collect and structure Sustainable Development Goal (SDG) event information from a wide range of regional portals and language-specific SDG websites. As SDG-related activities are increasingly organized and reported by diverse stakeholders across the globe, ranging from local governments to international NGOs, event data is often dispersed across decentralized platforms, published in different languages, and presented in unstructured or semi-structured formats. Traditional data collection methods struggle to keep up with the volume, variability, and linguistic diversity of such data sources. To address these challenges, this study leverages a hybrid approach that combines web scraping techniques with the LangChain framework, which allows seamless integ..

