A Visual Scene Graph (VSG) is a visually grounded graph over the objects in an image, where edges represent the relations between objects. Visual and semantic information is extracted from the image objects and processed by a relation inference module. One main challenge in VSG generation is that the training data are highly imbalanced: a few relations dominate the predicate categories. Existing solutions mainly rely on alternative loss functions or data-level approaches such as sampling. This paper addresses the long-tail problem in VSG generation from a new perspective: we enrich the semantic information with a tailored embedding based on Common Sense Knowledge Graphs (CSKGs). We first study the relatedness of visual-domain graphs and CSKGs and explore the gap between them, highlighting their differences. To bridge this gap, we investigate the effect of different knowledge graph embedding (KGE) techniques and sources of CSKGs. Our study shows that understanding the gap between the two tasks, i.e., KGE and VSG generation, and the nature of the data is crucial to designing an embedding tailored to the long-tail problem. Our proposed embedding can be created efficiently offline and used as a drop-in replacement for existing embeddings. Moreover, our method can be combined with other de-biasing techniques to further improve efficacy.