Recommender systems have been widely deployed on various business platforms. Traditional models usually focus on recommending items that match users’ interests, so accuracy is their major concern. However, many accurately recommended items would have been clicked by users even without recommendation. Given this, the causal effect was introduced to measure the difference in outcome (clicked or not) between recommending an item and not recommending it. Previous works demonstrate that most existing causality metrics are defined over all items and suffer from large variance. In this paper, we propose a new metric called R-CATE that measures the causal effect of the top-k recommended items and can be regarded as a hybrid of accuracy and causality measures. To optimize the R-CATE performance of recommendation lists, we design a learning-to-rank model that disentangles users’ interest embeddings and causality embeddings. Experiments on two semi-synthetic datasets show that our proposed model is superior to typical accuracy-oriented and causality-oriented methods.
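
To make the idea of a causal effect restricted to the top-k list concrete, the following minimal sketch averages the per-item uplift (outcome if recommended minus outcome if not) over the k highest-scored items. It is an illustration only and not necessarily the exact definition of R-CATE: the function name `topk_causal_effect`, its parameters, and the toy data are hypothetical, and it assumes both potential outcomes are known for every item, which holds only in semi-synthetic settings.

```python
from typing import Sequence


def topk_causal_effect(
    scores: Sequence[float],
    y_treated: Sequence[int],
    y_control: Sequence[int],
    k: int,
) -> float:
    """Mean of (outcome if recommended - outcome if not) over the k top-scored items.

    Hypothetical helper: assumes both potential outcomes are observable,
    as in a semi-synthetic dataset.
    """
    if not (len(scores) == len(y_treated) == len(y_control)):
        raise ValueError("all inputs must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    # Rank item indices by descending score; sorted() is stable, so ties keep input order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    return sum(y_treated[i] - y_control[i] for i in top_k) / k


# Toy data for four items: click outcome with and without recommendation.
y_treated = [1, 1, 0, 1]
y_control = [1, 0, 0, 0]  # item 0 would be clicked even without recommendation

# A ranker that favours item 0 (clicked anyway) versus one that favours items 1 and 3.
interest_scores = [0.9, 0.5, 0.1, 0.4]
uplift_scores = [0.1, 0.9, 0.2, 0.8]

print(topk_causal_effect(interest_scores, y_treated, y_control, k=2))  # 0.5
print(topk_causal_effect(uplift_scores, y_treated, y_control, k=2))    # 1.0
```

In this toy case the first ranking places an item at the top that would have been clicked anyway, so its top-2 list achieves a lower average effect than the second ranking, even though every item in the first list is clicked when recommended.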