Workshop 2: Using LLMs for content generation and evaluation in assessment development
Andrew Runge, Yena Park & Yigal Attali
Participants will gain an understanding of how LLMs work beyond the ChatGPT interface. In hands-on activities, they will tweak and extend Python code that interacts with GPT programmatically to generate and evaluate test content, including passages and items (keys and distractors).
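To give a flavor of the kind of code involved, the minimal sketch below calls GPT through the OpenAI Python client to request a passage with specified attributes. The model name, prompt wording, and parameters are illustrative assumptions, not the workshop's actual materials.

```python
# Minimal sketch of programmatic passage generation with the OpenAI Python client.
# Model name, prompt text, and parameters are illustrative, not the workshop's notebook code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_passage(genre: str, topic: str, cefr_level: str) -> str:
    """Ask the model for a short passage with the requested attributes."""
    prompt = (
        f"Write a 200-word {genre} passage about {topic} "
        f"suitable for readers at CEFR level {cefr_level}. "
        "Use complete paragraphs and avoid headings."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return response.choices[0].message.content

print(generate_passage("expository", "renewable energy", "B2"))
```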
Intended learning outcomes
- The ability to adapt the code to generate expository and argumentative passages with desired attributes
- The ability to filter passages appropriately using self-drafted content review guidelines
- The ability to generate candidate options for multiple-choice questions
- Understanding of the limitations of using LLMs for content generation
- Understanding of different methods to evaluate generated key and distractor candidates
Content of the workshop
- Passage generation based on different criteria (e.g., genre, level of lexical/syntactic complexity)
- Passage evaluation using generic and custom filters
- Item generation for main point and inference questions
- Item evaluation based on NLP-driven metrics (a brief sketch of example filters and metrics follows this list)
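As a rough illustration of the kinds of checks involved, the sketch below implements a simple custom passage filter (a length check plus a banned-topic screen) and a crude lexical-overlap metric for comparing a distractor to the key. The thresholds, banned terms, and metric are illustrative assumptions, not the workshop's actual review guidelines or NLP-driven metrics.

```python
# Illustrative stand-ins for passage filters and item metrics; standard library only.
# Thresholds, banned terms, and the overlap metric are assumptions for demonstration.
import re

def passes_custom_filters(passage: str,
                          min_words: int = 150,
                          max_words: int = 250,
                          banned_terms: tuple = ("violence", "gambling")) -> bool:
    """Length check plus a simple banned-topic screen."""
    words = re.findall(r"[A-Za-z']+", passage)
    if not (min_words <= len(words) <= max_words):
        return False
    lowered = passage.lower()
    return not any(term in lowered for term in banned_terms)

def lexical_overlap(option_a: str, option_b: str) -> float:
    """Jaccard overlap of word types; a crude proxy for option similarity."""
    a = set(re.findall(r"[a-z']+", option_a.lower()))
    b = set(re.findall(r"[a-z']+", option_b.lower()))
    return len(a & b) / len(a | b) if a | b else 0.0

# Flag distractors that are nearly paraphrases of the key.
key = "The author argues that recycling alone cannot offset rising consumption."
distractor = "The author argues that recycling alone offsets rising consumption."
score = lexical_overlap(key, distractor)
print(f"overlap = {score:.2f}")
if score > 0.7:  # illustrative threshold
    print("Distractor may be too close to the key; review manually.")
```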
Engagement methods
We will provide a Jupyter notebook that implements multi-step prompting with GPT in Python. Workshop participants can click on code blocks to execute them, make minimal changes to the existing code by following a pre-established pattern (no prior coding experience required), and see the outcomes of their changes.
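To illustrate the multi-step prompting pattern, the sketch below chains two calls: one generates a passage, and a second reviews it against simple guidelines. The helper function, model name, and prompt text are assumptions for illustration, not the notebook's actual code.

```python
# Minimal sketch of a two-step prompting chain: generate, then review.
# Model name and prompts are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    """Single-turn helper used by each step of the chain."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Step 1: generate a candidate passage.
passage = ask("Write a 150-word argumentative passage about urban green spaces.")

# Step 2: ask the model to review the passage against simple guidelines.
review = ask(
    "Review the passage below against these guidelines: clear main claim, "
    "no factual claims requiring citation, neutral tone. "
    "Answer PASS or FAIL with one sentence of justification.\n\n" + passage
)

print(review)
```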
Participant background
- Ability to recognize patterns
- Ability to follow existing patterns
Pre-workshop activities
None