INFORMATION RETRIEVAL IN LARGE LANGUAGE MODELS
Links to Files
Permanent Link
Author/Creator
Author/Creator ORCID
Date
2023-01-01
Type of Work
Department
Computer Science and Electrical Engineering
Program
Computer Science
Citation of Original Publication
Rights
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu
Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
Subjects
Abstract
Large Language Models (LLMs) have been used to retrieve information. This is an exciting opportunity to reduce the cost of user studies and to speed up social science research, which is often bottlenecked by those costs, but it also opens up opportunities for harm: in particular, LLMs have been shown to generate inconsistent, effectively random outputs. In this project, we investigate how skewed LLMs are in their demographic predictions of the US population by comparing their outputs to Pew Research surveys. For this research, we use one of the largest and most widely used LLMs, GPT-3.5 (text-davinci-003). We hope these results shed light on the gap that exists in information retrieval with LLMs and point to room for future work on improving their use, in this case with ChatGPT.
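The comparison described above can be sketched in a minimal way: given an LLM's predicted demographic distribution and a reference survey distribution over the same categories, compute how far apart they are. The function below uses total variation distance as one possible skew metric; the category labels and numbers are illustrative placeholders, not actual Pew Research or GPT-3.5 data.

```python
# Hypothetical sketch: measuring how far an LLM's predicted demographic
# distribution deviates from a reference survey distribution.
# All labels and probabilities below are made up for illustration.

def total_variation_distance(p, q):
    """Total variation distance between two discrete distributions,
    each given as a dict mapping category -> probability."""
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)

# Illustrative distributions over an age-bracket survey question.
survey = {"18-29": 0.21, "30-49": 0.34, "50-64": 0.25, "65+": 0.20}
llm    = {"18-29": 0.30, "30-49": 0.40, "50-64": 0.20, "65+": 0.10}

skew = total_variation_distance(survey, llm)
print(f"TVD between LLM and survey: {skew:.2f}")  # prints 0.15
```

A value of 0 would mean the LLM's predictions match the survey exactly; values closer to 1 indicate heavier demographic skew.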