Integrating data from heterogeneous sources in low-voltage (LV) power distribution grids is valuable to distribution system operators (DSOs). Power measurements from customer premises need to be processed together with other data, such as grid topology and line parameters, to deploy smart grid applications (SGA) such as real-time grid monitoring and voltage regulation. The most challenging task for DSOs is collecting and integrating data from several sources, because several entities are involved in data management and access to the databases is restricted. This paper presents an opEn common information Model (CIM) BAsed smaRt grid application frameworK (EMBARK) to address this challenge. EMBARK is developed as an efficient, modular, and scalable architecture for extracting relevant grid-related information from various asset management databases. EMBARK includes a novel data management functionality that enables data-driven updates of the settings and parameters of the algorithms behind smart grid applications. The proposed approach is demonstrated and numerically validated using grid data from a medium-sized distribution grid operator in Denmark. The architecture presented in this paper can support all phases from planning to actual smart grid operation, i.e., automatically building the models needed to perform load flows, grid impact studies, planning, and asset management.