17.8 0.4V 988nW Time-Domain Audio Feature Extraction for Keyword Spotting Using Injection-Locked Oscillators

Resource Type: Conference
Authors: Mostafa, Ali; Hardy, Emmanuel; Badets, Franck
Source: 2024 IEEE International Solid-State Circuits Conference (ISSCC), vol. 67, pp. 328-330, Feb. 2024
Subject: Bioengineering
Communication, Networking and Broadcast Technologies
Engineered Materials, Dielectrics and Plasmas
Photonics and Electrooptics
Robotics and Control Systems
Band-pass filters
Ring oscillators
Injection-locked oscillators
Rectifiers
Voltage
Speech recognition
Feature extraction
Language
ISSN: 2376-8606

Abstract

Always-on, voice-activated tinyML systems, like those implementing keyword spotting (KWS), demand low power consumption and a small footprint. In certain instances, sub-V energy-harvesting sources restrict the available supply voltage to below 0.5V [1]. Most KWS designs focus on optimizing the audio feature extraction (FEx) unit, which dominates the overall power and area. Analog FEx using multi-channel Gm-C bandpass filters (BPFs) and analog rectifiers [2], [3] can be as much as 10× more power efficient than digital FEx for a comparable silicon area [4]. However, analog FEx circuits have not demonstrated KWS with more than four keywords. They also suffer from a large footprint, challenging technology migration, and limited dynamic range (DR) at low supply voltage, while speech signals inherently have a high DR. These limitations ultimately lead to the use of time-domain (TD) [5], [6] or partial TD [7] alternatives. In [5], a 0.5V solar-powered TD-FEx employs a voltage-to-time converter (VTC) followed by ring oscillator (RO)-based BPFs to achieve 86% classification accuracy on 10 keywords, but it consumes an order of magnitude more power (9.3μW) than the existing state of the art [2], [3]. In [7], Gm-C BPFs followed by VTCs are used to enable time-domain rectification at 0.4V, but the analog part operates with at least a 0.6V supply. A solution operating at 0.4V is presented in [6]. It uses injection-locked oscillator (ILO)-based bandpass filters to process the signal directly in the phase domain, but KWS was not demonstrated due to the limited filter selectivity. To achieve high speech recognition accuracy, a quality factor (Q) of at least 4.05 is required for 16 log-spaced channels (125Hz-5kHz with -3dB crossover), which is not achieved in [2]–[6] (Q
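
The Q requirement quoted in the abstract can be sanity-checked with a short calculation. The sketch below is illustrative only and rests on assumptions not stated in the record: each channel is modeled as an ideal second-order bandpass filter, the 16 center frequencies are log-spaced with the first at 125Hz and the last at 5kHz, and adjacent channels cross where both responses are 3dB down. The function names `center_frequencies` and `required_q` are hypothetical helpers, not taken from the paper.

```python
import math


def center_frequencies(f_min_hz, f_max_hz, n_channels):
    """Log-spaced center frequencies, endpoints included (assumed layout)."""
    ratio = (f_max_hz / f_min_hz) ** (1.0 / (n_channels - 1))
    return [f_min_hz * ratio ** k for k in range(n_channels)]


def required_q(f_min_hz, f_max_hz, n_channels):
    """Q at which adjacent channels cross at their -3dB points.

    Assumes a second-order bandpass response, whose -3dB edges satisfy
    Q * |f/f0 - f0/f| = 1. Adjacent channels meet at the geometric mean
    of their centers, i.e. at f = f0 * sqrt(ratio).
    """
    ratio = (f_max_hz / f_min_hz) ** (1.0 / (n_channels - 1))
    edge = math.sqrt(ratio)  # normalized crossover frequency f/f0
    return 1.0 / (edge - 1.0 / edge)


if __name__ == "__main__":
    q = required_q(125.0, 5000.0, 16)
    print(f"required Q under these assumptions: {q:.3f}")
```

Under these assumptions the result is roughly 4.06, which is in line with the "at least 4.05" figure in the abstract; a different filter order or channel layout would give a different number.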
