System-Level Test (SLT) has emerged as an additional test step to detect manufacturing defects that traditional testing does not catch. For SLT, the Device Under Test (DUT) is embedded into an environment that emulates the end-user application as closely as possible and runs workloads composed of existing off-the-shelf software. We present an automatic greybox SLT program generation method that finds code snippets that control the DUT’s extra-functional properties, with the goal of achieving better characterization or improving the coverage of emerging defect types. In contrast to Automatic Test Pattern Generation (ATPG) or formal methods, our method does not require structural information and relies solely on simulation results or hardware measurements to guide the generation. We show that our method outperforms hand-crafted snippets on a RISC-V superscalar processor and investigate possible reasons for the snippets' observed behavior.