Voice activity detection (VAD) is a challenging problem in adverse acoustic environments (e.g. far-field recording and conditions with different types of noise). In this paper, we propose a Gaussian mixture model (GMM) for the log-energy distribution of noise and (noisy) speech, in which the distributions of these two components adapt themselves to non-stationary conditions. An adaptive threshold derived from the GMM parameters of the two components provides a reasonable boundary between noise and speech, which leads to accurate VAD under various noise conditions. To further improve the speech hit rate (SHR) and non-speech hit rate (NHR), we impose constraints on the proposed GMM to make it more reliable. Experimental results demonstrate that the proposed method performs well in terms of both SHR and NHR.
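To make the idea concrete, the following is a minimal sketch of a two-component GMM over frame log-energies with a threshold computed from the fitted parameters. It is not the paper's algorithm: the abstract does not give the update rules, the threshold formula, or the constraints, so everything specific here is an assumption. In particular, the sketch uses batch EM rather than an online, self-adapting update; the threshold is placed where the standardized distances to the two means are equal; the only "constraint" is a variance floor; and the names `frame_log_energy`, `fit_two_gaussians`, `adaptive_threshold`, and `detect_speech`, as well as the frame length and hop, are hypothetical.

```python
import numpy as np

def frame_log_energy(signal, frame_len=400, hop=160, eps=1e-10):
    """Log-energy per frame (frame_len and hop are illustrative values)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return np.log(np.sum(frames ** 2, axis=1) + eps)

def fit_two_gaussians(x, n_iter=50, var_floor=1e-3):
    """Batch EM for a 1-D, two-component GMM (assumed stand-in for the
    paper's adaptive update). Returns parameters sorted so that index 0 is
    the low-energy (noise) component and index 1 the speech component."""
    mu = np.percentile(x, [10, 90]).astype(float)   # crude initialisation
    var = np.full(2, max(np.var(x), var_floor))
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        log_p = (-0.5 * ((x[:, None] - mu) ** 2 / var + np.log(2 * np.pi * var))
                 + np.log(w))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and (floored) variances.
        nk = resp.sum(axis=0) + 1e-12
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = np.maximum((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk, var_floor)
        w = nk / nk.sum()
    order = np.argsort(mu)
    return mu[order], var[order], w[order]

def adaptive_threshold(mu, var):
    """Assumed rule: the point t with (t - mu_n)/sd_n == (mu_s - t)/sd_s."""
    sd = np.sqrt(var)
    return (sd[1] * mu[0] + sd[0] * mu[1]) / (sd[0] + sd[1])

def detect_speech(signal):
    """Frame-level decisions: True where log-energy exceeds the threshold."""
    x = frame_log_energy(signal)
    mu, var, _ = fit_two_gaussians(x)
    theta = adaptive_threshold(mu, var)
    return x > theta, theta, mu
```

Calling `detect_speech(signal)` on a one-dimensional NumPy array returns a boolean array with one decision per frame, together with the threshold and the two component means. Because the threshold is recomputed from the fitted means and variances, it moves with the noise floor instead of being fixed in advance, which is the property the abstract attributes to the adaptive threshold.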