Smart Chatbot Design to Support PDF Document Analysis at Politeknik Indonusa Surakarta

Ian Hanindya, Dwi Iskandar, Frestiany Regina Putri

Abstract


The increasing volume of academic documents in PDF format makes it difficult for students, lecturers, and academic staff to locate specific information quickly. This study proposes the design and development of an intelligent chatbot system that supports semantic analysis of academic PDF documents at Politeknik Indonusa Surakarta. The system integrates Natural Language Processing (NLP) techniques with a Large Language Model (LLM), specifically GPT-4, through the LangChain framework to interpret user queries and deliver context-aware responses. The Research and Development (R&D) methodology was applied using the 4D model: Define, Design, Develop, and Disseminate. The resulting prototype extracts content, summarizes sections, and answers user queries based on uploaded academic PDFs. Functional and usability tests were conducted using real academic documents. The results indicate high response accuracy (90%) and strong user satisfaction (a score of 4.5 out of 5), supporting the system’s performance. The chatbot demonstrated its ability to support academic services by improving access to unstructured knowledge and streamlining information retrieval. The system still faces challenges, including variations in PDF structure, dependency on third-party APIs, and the need for data privacy safeguards. This research provides a foundation for future implementations of AI-powered educational tools and suggests further development such as multilingual support, voice interaction, and institutional integration.



DOI: https://doi.org/10.29040/ijcis.v6i3.247


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License