Artificial Intelligence (AI) has the potential to fundamentally change the educational landscape. So far, much of the physics education research relating to AI has focused on lecture-based assessment and the ability of ChatGPT to answer conceptual surveys and traditional exam-style questions. In this study, we shift the focus by investigating ChatGPT's ability to complete an introductory mechanics laboratory activity using Code Interpreter, a recent plugin that writes and runs Python code `behind the scenes' to generate and analyse data. By uploading a common `spring constant' lab activity via Code Interpreter, we investigate the ability of ChatGPT to interpret the activity, generate realistic model data, produce a line fit, and calculate the reduced chi-square statistic. By analysing our interactions with ChatGPT, along with the Python code generated by Code Interpreter, we assess how the quality and accuracy of ChatGPT's responses depend on different levels of prompt detail. We find that although ChatGPT is capable of completing the lab activity and generating plausible-looking data, the quality of the output is highly dependent on the detail and specificity of the text prompts provided. We find that the data generation process adopted by ChatGPT in this study leads to heteroscedasticity in the simulated data, which may be difficult for novice learners to spot. We also find that when real experimental data are uploaded via Code Interpreter, ChatGPT is capable of correctly plotting and fitting the data, calculating the spring constant and associated uncertainty, and calculating the reduced chi-square statistic. This work offers new insights into the capabilities of Code Interpreter within a laboratory setting and highlights a variety of text-prompt strategies for the effective use of Code Interpreter in a lab context.
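To illustrate the kind of analysis described above, the following is a minimal sketch of a weighted line fit and reduced chi-square calculation for a Hooke's-law (spring constant) data set. The numerical values and uncertainties are illustrative assumptions, not the study's actual measurements or the code Code Interpreter produced.

```python
import numpy as np

# Illustrative Hooke's-law data (F = k x); values are assumed, not from the study.
x = np.array([0.024, 0.049, 0.076, 0.099, 0.126])   # spring extension (m)
F = np.array([0.49, 0.98, 1.47, 1.96, 2.45])        # applied force (N)
sigma_F = np.full_like(F, 0.02)                     # assumed force uncertainty (N)

# Weighted straight-line fit; the slope estimates the spring constant k.
(k, b), cov = np.polyfit(x, F, 1, w=1.0 / sigma_F, cov=True)
k_err = np.sqrt(cov[0, 0])                          # uncertainty on the slope

# Reduced chi-square: chi^2 divided by degrees of freedom
# (N data points minus 2 fitted parameters).
resid = F - (k * x + b)
chi2_red = np.sum((resid / sigma_F) ** 2) / (len(F) - 2)
```

A reduced chi-square near 1 indicates the scatter of the data about the fit is consistent with the stated uncertainties, which is the diagnostic role it plays in the lab activity.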
Comment: 17 pages, 7 figures