For dialogue systems to remain in use in everyday home settings, we believe it is effective for agents to have their own preferences and initiative, and to hold a role in daily life beyond dialogue itself. In this study, we aimed to build a collaborative music listening agent with the following functions: 1) forming preferences based on the songs it has listened to; 2) recommending songs based on both its own preferences and those of its partner; 3) playing music for which its preferences are still unknown (curiosity-driven recommendation); 4) selecting utterances to which the partner is likely to respond positively; 5) trying utterances to which the partner's response is difficult to predict (curiosity-driven dialogue); and 6) controlling the dialogue based on an estimate of the knowledge held by a large language model. This paper discusses the open issues and future prospects revealed by building a prototype.