Workshop 2: Using LLMs for content generation and evaluation in assessment development

Andrew Runge, Yena Park & Yigal Attali

Participants will gain an understanding of how LLMs work beyond the ChatGPT interface and engage in hands-on activities in which they tweak and extend Python code that interacts with GPT programmatically to generate and evaluate test content, including passages and items (keys and distractors).
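As a rough illustration of what interacting with GPT programmatically can look like (not the workshop's own notebook code), the sketch below requests a passage with specified attributes through the OpenAI Python SDK. The model name, prompt wording, and the generate_passage helper are illustrative assumptions.

```python
# A minimal sketch (not the workshop notebook): generating a reading passage
# with specified attributes via the OpenAI Python SDK. Assumes OPENAI_API_KEY
# is set in the environment; the model name and prompt are illustrative only.
from openai import OpenAI

client = OpenAI()

def generate_passage(genre: str, level: str, topic: str, n_words: int = 250) -> str:
    prompt = (
        f"Write a {n_words}-word reading passage in the {genre} genre "
        f"on the topic '{topic}'. Target a {level} reading level, with "
        f"correspondingly simple or complex vocabulary and sentence structure."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",          # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0.8,              # some variety across generated passages
    )
    return response.choices[0].message.content

print(generate_passage("expository", "upper-intermediate", "urban beekeeping"))
```

Because the request is made in code rather than through the chat interface, attributes such as genre and level become parameters that can be varied systematically across many generations.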

Intended learning outcomes

  • The ability to adapt the code to generate expository and argumentative passages with desired attributes
  • The ability to filter passages appropriately using self-drafted content review guidelines (one such filter is sketched after this list)
  • The ability to generate candidate answer options for multiple-choice questions
  • Understanding of the limitations of using LLMs for content generation
  • Understanding of different methods to evaluate generated key and distractor candidates
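As a hedged sketch of the kind of guideline-based filtering referred to above, the function below asks the model to judge a passage against a short set of self-drafted review guidelines and returns a pass/fail decision. The guidelines, model name, prompt format, and the passes_review helper are assumptions for illustration, not the workshop's materials.

```python
# A sketch of passage filtering against self-drafted content review guidelines,
# using an LLM-as-judge prompt. Guidelines, model name, and the PASS/FAIL
# output format are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

REVIEW_GUIDELINES = """\
1. The passage is self-contained and factually plausible.
2. The passage avoids sensitive or potentially biased content.
3. The passage stays on a single, clearly identifiable main topic.
"""

def passes_review(passage: str) -> bool:
    prompt = (
        "Review the passage below against each guideline. "
        "Answer with a single word, PASS or FAIL, on the last line.\n\n"
        f"Guidelines:\n{REVIEW_GUIDELINES}\nPassage:\n{passage}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",          # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,                # keep judgments as stable as possible
    )
    verdict = response.choices[0].message.content.strip().splitlines()[-1]
    return verdict.upper().startswith("PASS")
```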

Content of the workshop

  • Passage generation based on different criteria (e.g., genre, level of lexical/syntactic complexity)
  • Passage evaluation using generic and custom filters
  • Item generation for main point and inference questions
  • Item evaluation based on NLP-driven metrics (one possible metric is sketched after this list)
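One example of an NLP-driven metric that could be used for item evaluation (not necessarily one covered in the workshop) is the embedding similarity between each distractor and the key: distractors that are near-paraphrases of the key, or entirely unrelated to it, are likely to need review. The sketch below computes this with the sentence-transformers library; the model name and the example thresholds are assumptions.

```python
# One possible NLP-driven metric for evaluating answer options: cosine
# similarity between each distractor and the key, computed with
# sentence-transformers embeddings. Model name and thresholds are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose encoder

def key_similarity(key: str, distractors: list[str]) -> list[float]:
    key_emb = model.encode(key, convert_to_tensor=True)
    dist_emb = model.encode(distractors, convert_to_tensor=True)
    return util.cos_sim(key_emb, dist_emb)[0].tolist()

scores = key_similarity(
    "The author argues that cities should fund public transit.",
    ["The author argues that cities should ban cars.",
     "The author describes the history of the bicycle.",
     "The author argues that transit funding should be cut."],
)
print(scores)  # e.g., flag distractors with similarity above 0.9 or below 0.2
```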

Engagement methods

We will provide a Jupyter notebook that implements multi-step prompting with GPT in Python. Workshop participants can click on code blocks to execute the commands, make minimal changes to the existing code that follow a pre-established pattern (no prior coding experience required), and see the outcomes of their changes.
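For a sense of what such a minimal-change cell might look like, the sketch below chains generation and filtering in a single notebook cell, assuming the hypothetical generate_passage and passes_review helpers sketched earlier; participants would edit only the quoted settings at the top.

```python
# A sketch of a fill-in-the-blank notebook cell (assuming the hypothetical
# generate_passage and passes_review helpers sketched above): generate several
# candidate passages, keep only those that pass the content review filter.
GENRE = "argumentative"        # <-- try "expository"
LEVEL = "upper-intermediate"   # <-- try another proficiency level
TOPIC = "school uniforms"      # <-- try any topic
N_CANDIDATES = 3

accepted = []
for _ in range(N_CANDIDATES):
    passage = generate_passage(GENRE, LEVEL, TOPIC)
    if passes_review(passage):
        accepted.append(passage)

print(f"Kept {len(accepted)} of {N_CANDIDATES} candidate passages.")
```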

Participant background

  • Ability to recognize patterns
  • Ability to follow existing patterns

Pre-workshop activities

None
