Abstract

Background
Starch is a complex branched glucose polymer, mainly comprising amylose and amylopectin. The chain-length distribution (CLD), i.e. the number of individual chains as a function of the number of monomer units they contain, is controlled by the underlying biosynthetic processes occurring during plant growth. CLDs are currently commonly related to biosynthetic processes and to functional properties by dividing them into arbitrarily chosen regions. However, this empiricism is not completely satisfactory: conclusions can depend on the choice of division. Biosynthesis-based models that replace this arbitrary division with model-based parameterization have been developed, but are presently rarely used because of the complex underlying mathematics.

Scope and approach
These models are summarized in non-mathematical language. They give information on the biosynthetic processes producing the starch, and yield a parameterization of CLD data whose fit is essentially the same as the experimental data. Additionally, the models are sufficiently flexible that they can also fit data for modified starches. This enables the whole CLDs for both amylose and amylopectin to be expressed as a small number of parameters, which can be used to find statistically valid structure-property relations.

Key findings and conclusions
The underlying theory and data-fitting methodology can be used to better understand starch biosynthesis, to see which structural features control functional properties, and to deduce mechanisms for observed correlations. This enables grains and other starch sources to be selected and processed in non-empirical ways for improved foods and other products.

Highlights
• Starch molecular structure: a major determinant of properties of starch-based foods.
• The fundamental structural level is the chain-length distribution (CLD).
• Finding structure-property relations needs non-empirical CLD parameterization.
• Biosynthetic models enabling this are explained in a non-mathematical way.
• This enables publicly-available code to be readily used to find such relations.
[ABSTRACT FROM AUTHOR]