This work presents a proof of concept for a system that teaches a robot's Digital Twin (DT) to interact with objects through human demonstration, with the ultimate goal of transferring the learned knowledge to the real robot. This is particularly useful when the real robot operates in remote or dangerous areas that humans cannot access. It is worth noting that the system focuses primarily on achieving an initial end-to-end implementation rather than on the learning component. The proposed system uses shape features extracted from RGBD camera data with Computer Vision (CV) techniques to enable the DT to interact with virtual versions of real objects. The system is divided into four phases: demonstration, learning, execution, and evaluation, and is tested on objects of increasing shape complexity. The primary challenges addressed are accurately translating real interactions into virtual ones, detecting shape features with CV techniques, and ensuring that the resulting actions are feasible. Potential applications include hazardous-materials handling, manufacturing automation, and other scenarios where robots must interact with objects in varied settings. The project aims to enable robots to perform complex interactions with objects without human intervention, increasing efficiency and safety across industries.
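The abstract mentions shape features computed from RGBD data with CV techniques but does not specify which descriptors are used. As a hedged illustration only, the sketch below computes a few common 2-D shape descriptors (area, bounding box, aspect ratio, centroid) from a binary object mask, such as one obtained by segmenting an object's silhouette out of an RGBD frame; the function name and feature set are assumptions, not the paper's actual pipeline.

```python
def shape_features(mask):
    """Compute simple 2-D shape descriptors from a binary object mask.

    `mask` is a list of lists of 0/1 values, e.g. an object silhouette
    segmented from an RGBD frame.  This is an illustrative sketch; the
    original system's exact feature set is not specified in the abstract.
    """
    # Collect the (row, col) coordinates of all object pixels.
    coords = [(r, c) for r, row in enumerate(mask)
                     for c, v in enumerate(row) if v]
    if not coords:
        raise ValueError("mask contains no object pixels")
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    area = len(coords)                       # pixel count
    height = max(rows) - min(rows) + 1       # bounding-box height
    width = max(cols) - min(cols) + 1        # bounding-box width
    return {
        "area": area,
        "bbox": (min(rows), min(cols), max(rows), max(cols)),
        "aspect_ratio": width / height,
        "centroid": (sum(rows) / area, sum(cols) / area),
    }
```

In a full pipeline, descriptors like these could be attached to the virtual object so the DT can reason about graspable geometry; richer features (contours, surface normals from depth) would typically come from a library such as OpenCV.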