A new signal processing algorithm for robust speech recognition using multiple microphones is described. The algorithm, loosely based on human binaural perception, imposes time‐aligning delays on the speech signals from each microphone and passes the delayed speech through a bank of bandpass filters and nonlinear rectifiers. The outputs of the nonlinear rectifiers within each frequency band are cross‐correlated across microphones, providing an estimate of the spectral profile of short‐term energy in the speech signal that is resilient to the presence of off‐axis noise sources. A cepstral representation of these energy estimates is used as the feature set for automatic speech recognition using the CMU SPHINX system. The multichannel cross‐correlation‐based algorithm was found to preserve the shape of vowel spectra in additive noise and to provide better recognition accuracy than equivalent single‐channel processing with non‐close‐talking microphones. The performance of this system was compared to that obtained using delay‐and‐sum beamforming and conventional adaptive filtering approaches. Finally, some implications of these results for human binaural hearing will be discussed. [Work supported by Motorola and DARPA.]
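
The processing chain described above (time alignment, bandpass filtering, rectification, cross‐correlation within each band, cepstral conversion) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the filter shapes, the rectifier, the correlation lag, or the cepstral transform, so everything below is an assumption made for illustration: two‐pole resonators stand in for the bandpass filters, the rectifier is half‐wave, the cross‐correlation is taken at zero lag and averaged over microphone pairs, delays are whole non‐negative sample counts, and the cepstrum is a DCT‐II of the log band energies. All function and parameter names are hypothetical.

```python
import math


def delay(signal, n):
    """Delay a signal by n >= 0 whole samples (zero-padded), keeping its length."""
    return [0.0] * n + list(signal[:len(signal) - n])


def bandpass(signal, center_hz, fs, r=0.95):
    """Two-pole resonator centred on center_hz (assumed stand-in for one filter)."""
    theta = 2.0 * math.pi * center_hz / fs
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = (1.0 - r) * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def half_wave(signal):
    """Half-wave rectifier (assumed form of the nonlinear rectifier)."""
    return [max(0.0, x) for x in signal]


def band_energies(mics, delays, centers_hz, fs):
    """Zero-lag cross-correlation of rectified band outputs, averaged over mic pairs."""
    aligned = [delay(m, d) for m, d in zip(mics, delays)]
    energies = []
    for fc in centers_hz:
        rect = [half_wave(bandpass(a, fc, fs)) for a in aligned]
        total, pairs = 0.0, 0
        for i in range(len(rect)):
            for j in range(i + 1, len(rect)):
                total += sum(p * q for p, q in zip(rect[i], rect[j])) / len(rect[i])
                pairs += 1
        energies.append(total / pairs)
    return energies


def cepstrum(energies, n_coeffs):
    """DCT-II of log band energies (assumed cepstral transform)."""
    logs = [math.log(e + 1e-12) for e in energies]  # small floor avoids log(0)
    n = len(logs)
    return [
        sum(l * math.cos(math.pi * k * (i + 0.5) / n) for i, l in enumerate(logs))
        for k in range(n_coeffs)
    ]


# Usage: a 500 Hz tone reaching the second microphone two samples late.
fs = 8000
tone = [math.sin(2.0 * math.pi * 500.0 * t / fs) for t in range(800)]
mics = [tone, delay(tone, 2)]
# Delaying the first microphone by two samples time-aligns the pair.
energies = band_energies(mics, [2, 0], [500.0, 1500.0, 2500.0], fs)
features = cepstrum(energies, 3)
```

In this sketch, a component that is not time‐aligned across microphones produces rectified band outputs that overlap less, so its contribution to the cross‐correlation is reduced. That is the intuition behind the resilience to off‐axis sources, shown here only under the assumptions listed.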