Date: 29 Nov 1995
From: Jean-Pierre Lereboullet
To: ajr@eng.cam.ac.uk
Subject: VoiceRecognitionProcessorsDATABASE29/11/95

+---------------------------------------------+
|   VOICE RECOGNITION PROCESSORS (DATABASE)   |
|             29/11/1995 Version              |
+---------------------------------------------+

I would like to thank Tony Robinson for placing this database on the
comp.speech FTP site (svr-ftp.eng.cam.ac.uk) under
comp.speech/info/VoiceRecognitionProcessors.
It is also available via Netscape at:
ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/info/VoiceRecognitionProcessors

I would also like to thank Andrew Hunt for including a reference to
this document in his powerful monthly FAQ.

---------------------------------------------------------------
This mail covers the VRPs I have come across. Most of them are IWR
and SD systems. Some are probably missing, so please mail me to add
them or to complete the entries.

Fax     : (33 1) 43.31.71.88
Address : Jean-Pierre LEREBOULLET
          63, rue Pascal, 75013 PARIS
          FRANCE
---------------------------------------------------------------

VOICE RECOGNITION PROCESSORS (VRP)

Abbreviations :
  SD  : Speaker Dependent
  SI  : Speaker Independent
  IWR : Isolated Word Recognition
  CSR : Continuous Speech Recognition (Word Spotting)

Section 1 : WHAT VRP ARE AVAILABLE TODAY?
  * DVC306 (New!!!)      (DSP Communications, Inc.)
  * D6106                (DSP Communications, Inc.)
  * HM2007               (Hualon)
  * MSM6679              (OKI semiconductor)
  * RSC-164              (Sensory Circuits)
  * TC8860F, 64F, 65F    (Toshiba)
  * 5A128, custom        (Ricoh)

Section 2 : VRP OF THE PAST (looking for information about them)
  * Voice III            (Advanced Products & Technology)
  * Voice                (Asulab)
  * VX2, Voice Master    (Citizen)
  * MN 1263              (Matsushita/Panasonic)
  * ...                  (NTT)
  * uPD7761 & uPD7764    (NEC)
  * SRB32                (Sanyo)

Section 3 : VRP OF THE FUTURE (looking for information about it)
  * ...                  (Ricoh)

---------------------------------------------------------------

+----------------------------------------------+
| Section 1 : WHAT VRP ARE AVAILABLE TODAY?    |
+----------------------------------------------+

Name           : DVC306 (September 95)
Made by        : DSP Communications, Inc.
                 20300 Stevens Creek Blvd. suite 465
                 Cupertino, CA 95014 USA
                 Tel: 408-777-2700, Fax: 408-777-2770
                 Kelly BIRMINGHAM, Marketing Manager,
                 Tel: 408-777-2770
                 Gity BARGHI RAINEY, Mktg. & Sales Administrator,
                 Tel: 408-777-2762

                 DSP Communications, Ltd.
                 11 Ben Gurion Street, Givat Shmuel, 51905, ISRAEL
                 Gabriel Hilevitz
                 Tel: 972-3-531-3300, Fax: 972-3-531-3303

                 DSP Communications, Inc.
                 Gotanda Alpha Bldg, 9th Floor
                 29-9, 2-Chome Nishi Gotanda
                 Shinagawa-Ku, Tokyo 141, Japan
                 Tel: 81-3-5496-1611, Fax: 81-3-5496-1615
Price          : $10
Nb of words    : 16-128 words for up to 8 users in SD mode
                 (32K-1M SRAMs).
                 Response times: SI 500 ms, SD 300 ms.
                 Active vocabulary size of <30 words recommended.
                 SI memory : 8 Kbytes + 400 bytes/word
                 SD memory : approximately 240 bytes/word
Specifications : CSR, SI & SD.
                 This single DSP chip is a dual-mode (SI and SD)
                 voice command processor. Highly robust to external
                 noise (>10 dB), the DVC306 achieves high recognition
                 rates even in noisy environments (a car, for
                 example). In addition, this new product offers
                 speech synthesis (for feedback and prompting),
                 KEY-WORD SPOTTING, adaptive audio cancellation
                 (2 audio inputs), and memo pad recording (for short
                 messages).
                 Power supply: 3-5 V, temperature range 0°C to 70°C.
                 The DVC306 is supplied in a 100 pin TQFP.
                 You have to add:
                 - a CODEC: 8 bit microlaw PCM or 16 bit linear
                   (8 kHz sampling rate, input dynamic range 40 dB).
                 - SRAM: configurations from 32K to 1M bytes,
                   depending on the size of the speaker-dependent
                   vocabularies.
                 - EPROM/ROM: configurations from 32K to 1M bytes,
                   holding the compressed prompts and the
                   speaker-independent templates.
                 - HOST: any standard low-cost 4 or 8 bit
                   micro-controller may be used. The interface
                   consists of an 8 bit bi-directional bus and
                   control signals.
                 - XTAL: a 32 MHz crystal for operation.
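As a rough aid for choosing the memory size, the per-word figures quoted
above for the DVC306 can be turned into a small calculation. This is only
a sketch under my own assumptions, not taken from the DSP Communications
documentation: 1 Kbyte is taken as 1024 bytes, SD storage is assumed to
grow linearly with the number of users, and the function names are
hypothetical.

```python
# Rough memory estimate from the DVC306 figures quoted in this entry.
# Assumptions (not from the data sheet): 1 Kbyte = 1024 bytes, and
# SD storage scales linearly with the number of users.

SI_BASE_BYTES = 8 * 1024     # fixed "8 Kbytes" part in SI mode
SI_BYTES_PER_WORD = 400      # "+ 400 bytes/word" in SI mode
SD_BYTES_PER_WORD = 240      # "approximately 240 bytes/word" in SD mode


def si_memory_bytes(words):
    """Estimated SI memory for a vocabulary of `words` words."""
    return SI_BASE_BYTES + SI_BYTES_PER_WORD * words


def sd_memory_bytes(words, users=1):
    """Estimated SD memory for `words` words trained by each user."""
    return SD_BYTES_PER_WORD * words * users


if __name__ == "__main__":
    print(si_memory_bytes(30))       # 30-word SI vocabulary
    print(sd_memory_bytes(128))      # one user, 128 words
    print(sd_memory_bytes(128, 8))   # eight users, 128 words each
```

With these assumptions, one user with 128 words needs 30720 bytes, which
fits the smallest (32K) SRAM, and eight users with 128 words each need
245760 bytes, well inside the 1M upper limit.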
Demo board     : EVS306 available for $500
Example of application :
                 Integrated Handsfree Voice Dialer
                 Voice Command for PDAs, consumer & industrial.
Distributors   : DSP Communications, Inc. (same address)
---------------------------------------------------------------
Name           : D6106
Made by        : DSP Communications, Inc. (same address)
Price          : $36.7 in small quantities, $22.4 in hundreds,
                 and below $15 in tens of thousands.
Nb of words    : 16-128 words (between 0.2 and 1.1 sec per word)
                 (response time: 0.3 sec in quiet mode, less than
                 1 sec in noisy mode for 16 words)
Specifications : IWR and SD.
                 The processor is specially optimized to operate in
                 a noisy environment such as a car.
                 Power supply: 5 V, temperature range 0°C to 70°C.
                 The D6106 is supplied in an 80 pin PQFP.
                 You have to add:
                 - a CODEC (a standard microlaw PCM codec).
                 - SRAM: configurations from 1*32 to 2*128 Kbytes
                   to store the recognition parameters, templates
                   and compressed trained words.
                 - EPROM/ROM: configurations from 1*32 to 2*128
                   Kbytes, corresponding to 0.56 to 4.5 minutes of
                   compressed speech.
                 - HOST: any standard low-cost 4 or 8 bit
                   micro-controller may be used as a host.
                 - XTAL: a 29.491 MHz crystal for operation.
Demo board     : EVS6106 available for $500
Example of application :
                 Integrated Handsfree Voice Dialer in a Cellular
                 Subscriber Unit.
Distributors   : DSP Communications, Inc., USA (same address)
---------------------------------------------------------------
Name           : HM2007
Made by        : HUALON Microelectronics (HM) Corporation
Country        : Taiwan
Price          : $25 for a single quantity, ......
Nb of words    : 40 words (0.9 sec each word) or
                 20 words (1.9 sec each word)
Specifications : IWR and SI. It is a single-chip CMOS LSI.
                 48 pin plastic DIP package, single 5 V power supply,
                 6 mA operating current (idle), 15 mA operating
                 current (max), 70°C max operating temperature,
                 response time less than 300 ms.
                 It can work with an electret microphone. It needs an
                 external 8 Kbyte static RAM and an external
                 micro-controller.
                 (The complete specifications were available by
                 sending an e-mail to "mecoinfo@indirect.com" with
                 "help" on the subject line... I can mail them to you
                 if you ask me...)
Demo board     : $100
Example of application :
                 this product seems to be good for toys
Distributor    : Mr John Panico
                 Images Company
                 tel: 1-800-230-4535 (inside the US)
                 tel: (718) 698-8305 (international)

                 These two distributors no longer carry this chip:

                 Marywale Engineering Company
                 PO Box 23786
                 Phoenix, Arizona, 85063
                 USA
                 tel : (602) 247-6167 (home)
                 fax : (602) 247-4451

                 The Summa Group Limited
                 One California Street
                 Suite II 1940
                 SAN FRANCISCO, CA 94111
                 tel : (415) 288 0390
                 fax : (415) 288 0390
---------------------------------------------------------------
Name           : MSM 6679 (alias VRP 6679)
Made by        : OKI SEMICONDUCTOR GROUP
                 Corporate Headquarters
                 785 North Mary Avenue
                 Sunnyvale, CA 94086-2909 USA
                 Tel : 408/720-1900
                 Fax : 408/720-1918
Price          : $20
Nb of words    : 25 words max. with a single chip, but you can add
                 external RAM (50 bytes/word) and hold several
                 vocabularies of 30 words.
Specifications : IWR, SD (with 2-pass training) and SI.
                 (Yes, it's true: it is a simultaneous SD and SI
                 system.)
                 OKI's MSM6679 is a REVOLUTION: it is a single-chip
                 speaker-independent recognition solution (with
                 speaker-dependent capability for user-defined
                 customization), with a high recognition rate
                 (97% in a car) and the proprietary VCS (Voice
                 Control System, from Dallas) algorithm, based on
                 "dynamic time warping" and "hidden Markov models".
                 It works with a single 5 V supply, with NO external
                 CPU, from -40 to 85°C...
                 BUT if you want to create an SI system with a
                 specific vocabulary, it costs $200 per new word and
                 $5000 for the global "Mask Charges".
                 OKI told me that the MSM6679 is a SD chip - industry
                 first. If you want the complete description, call
                 1-800-OKI-6388 and ask for the "package 86".
Demo board     : VRP serial port interface for PC, $876.
Example of applications :
                 computers (demo available), cellular phone (demo
                 available), automotive, vending machines, handicap
                 aids, industrial controls, measuring equipment,
                 diagnostic instruments, games/toys...
Distributors   : Contact your OKI Regional Sales Office for other
                 information. The MSM6679 seems to be available from
                 the END of MAY 95.
                 USA:
                 Northwest Area : Tel 408/720-8940, Fax 408/720-8965
                 Southwest Area : Tel 714/752-1843, Fax 714/752-2423
                 Central Area   : Tel 214/690-6868, Fax 214/690-8233
                 Southeast Area : Tel 404/960-9660, Fax 404/960-9682
                 Eastern Area   : Tel 508/688-8687, Fax 508/688-8896
                 OKI Automotive Electronics (Livonia, near Detroit):
                 call Mr Stan KULESA at Tel 313/464-7200
                 Email : Stan_Kulesa@quickmail.okisemi.com
                 GERMANY (EUROPE) :
                 OKI ELECTRIC EUROPE GmbH
                 Hellersbergstrasse 2, D-41460 Neuss
                 Call Mrs Sibylle EBERT at
                 Tel 0 21 31/15 96-0, Fax 0 21 31/10 35 39
                 FRANCE :
                 OKI (France)
                 148, rue de Chevilly, 94240 L'HAY Les ROSES
                 Call Mrs Maguy POMMIES at
                 Tel (1) 45 60 03 28, Fax (1) 49 78 09 58
---------------------------------------------------------------
Name           : RSC-164 (Recognition-Synthesis-Control)
                 RSC-164i (a reduced-cost version with the external
                 memory bus removed)
Made by        : SENSORY CIRCUITS Inc.
                 1735 North First St. Suite 313
                 San Jose, California 95112-4511
                 Tel : (408) 452-1000, Fax : (408) 452-1025
                 E-mail : sales@sensoryc.com
                 Contact : Jeff Rogers (Sales Manager)
                 Web site : http://www.sensoryc.com/
Country        : USA
Price          : $3.75 each in large volumes
Nb of words    : 20 - 100 words on chip, unlimited with off-chip ROM
Specifications : IWR and SI or SD. It employs a neural network.
                 Up to 40 seconds of speech playback on-chip.
                 External memory bus for large vocabularies.
                 ADPCM voice record and playback with external RAM.
                 4 MIPS 8 bit RISC processor for general product
                 control.
                 3-6 V supply, <10uA standby and -5uA operating.
                 14-16 programmable general-purpose I/O lines.
                 68 pad CMOS; on-chip filters, A/D, and D/A.
                 AGC (Automatic Gain Control) circuitry allows
                 speaking at different vocal volumes or at varying
                 distances from the microphone.
                 Custom vocabulary ROM data can be encoded for a
                 masking charge of $3500; vocabulary-development
                 charges start with a base fee of $2500 and $250/word
                 for recognition, and $500 per recording session plus
                 $50 to $100/word for synthesis. Royalty-free sound
                 effects are available for $100/s.
Demo board     : Samples available for Sensory partners.
Example of application :
                 Toys, Electronic Learning Aids, Watches.
Distributors   : These products are now shipping.
---------------------------------------------------------------
Name           : TC 8860F (alias TC 8060F00BS)
Made by        : TOSHIBA CORPORATION
                 1-1, Shibaura 1-Chome, Minato-ku,
                 Tokyo, 105-01, JAPAN
                 Tel (03) 3457-3914, Fax (03) 3451-0576
Price          : 40.15 FF (French francs; $1 = 5 Fr)
Nb of words    : 10 words
Specifications : IWR and SD. Linear matching method (Toshiba has
                 developed its own algorithm), internal 4K-bit RAM
                 for recording memory, oscillation frequency: 800 kHz,
                 power supply: 4.5 - 5.5 V, operating current: 4.5 mA,
                 standby current: 0.2 uA, package MFP44.
Demo board     : TBP88D60. Not available, but really easy to build.
                 You just have to add a battery, a microphone and a
                 key matrix (10-15 buttons), and you get your result
                 on 10 LEDs.
                 Ask Toshiba to send you the documentation called
                 "Speech LSIs data book". It's free, and it gives you
                 everything you need to develop with the TC 8860F
                 (it's really easy).
Example of applications :
                 Toys...
Distributors   : This product is available in Europe (made in
                 Germany). For France it's:
                 Arrow Electronique S.A.
                 73/79 rue des solets - silic 585 -
                 94663 Rungis Cedex France
                 Call Mrs Joelle TISSERANT
                 Tel 33(1) 49 78 49 78, Fax 33(1) 49 78 05 96
---------------------------------------------------------------
Name           : TC8864F-00 + (TC8861F + 64K-bit SRAM)
Made by        : TOSHIBA CORPORATION (same address)
Price          : ?
Nb of words    : 50 words
Specifications : IWR and SD. Multiple similarity method, speaker
                 specific, 64K-bit external memory, for GENERAL
                 ENVIRONMENT USE.
                 First mode  : 2 MHz, 4.5-5.5 V, operation 13 mA,
                               standby 0.2 uA, package MFP60.
                 Second mode : 8 MHz, 4.5-5.5 V, operation 5 mA,
                               standby 0.2 uA, package FP100.
                 Ask Toshiba to send you the documentation called
                 "Speech LSIs data book". It's free, and it gives you
                 everything you need to develop with this chip.
                 Someone told me that this chip is one of the best
                 performers in high-noise environments... He found an
                 interesting solution: he records typical background
                 sounds as separate words, so that they do not
                 trigger the chip... These disturbing sounds then
                 have no effect...
Demo board     : TBP88D64, price unknown
Example of application :
                 anything in a general environment...
Distributors   : (same address)
---------------------------------------------------------------
Name           : TC8865F-00 + (TC8861F Codec A/D + 64K-bit SRAM)
Made by        : TOSHIBA CORPORATION (same address)
Price          : TC8865F = 86 Fr (French francs)
                 TC8861F = 86 Fr too
Nb of words    : 20 words
Specifications : IWR and SD. Multiple similarity method, speaker
                 specific, 64K-bit external memory, for HIGH NOISE
                 ENVIRONMENT USE.
                 First mode  : 2 MHz, 4.5-5.5 V, operation 13 mA,
                               standby 0.2 uA, package MFP60.
                 Second mode : 8 MHz, 4.5-5.5 V, operation 5 mA,
                               standby 0.2 uA, package FP100.
                 Ask Toshiba to send you the documentation called
                 "Speech LSIs data book". It's free, and it gives you
                 everything you need to develop with this chip.
Demo board     : TBP88D65 available for 4437.20 Fr (French francs)
Example of application :
                 anything in a high-noise environment...
Distributors   : (same address)
---------------------------------------------------------------
Name           : 5A128 (including One Time Programmable ROM, for
                 development and evaluation)
                 or 5S830 (including Mask ROM, for mass production)
Made by        : RICOH CORPORATION
                 Electronic Devices Division
                 San Jose Office
                 3001 Orchard Parkway, San Jose,
                 CA 95134-2088, USA
                 Tel : 1-408-432-8800, Fax 1-408-432-8375

                 RICOH COMPANY, LTD
                 Electronic Devices Division
                 * Headquarters :
                   13-1, Himemururo-cho, Ikeda City,
                   Osaka 563, JAPAN
                   Tel : 81-727-53-1111, Fax 81-727-53-6011
                 * Yokohama Office :
                   3-2-3, Shin-Yokohama, Kohoku-ku,
                   Yokohama City, Kanagawa 222, JAPAN
                   Tel : 81-45-477-1697, Fax 81-45-477-1694 or 1695
Country        : USA-JAPAN
Price          : $7-10 depending on quantities.
Nb of words    : 10 words in the SI (speaker independent) mode
                 (max. 5 sec/word)
                 3 words in the SD (speaker dependent) mode
                 (max. 2 sec/word)
Specifications : IWR and SD and SI.
                 One-chip Voice Recognition Controller.
                 Power supply voltage: 5 V +-10%,
                 up to 50 mA in operating mode,
                 up to 50 uA with RAM backup in sleep mode,
                 package: 80 pin Plastic Flat Package.
Demo board     : $3000, for development and evaluation
Example of applications :
                 Toys, dialer, voice-activated equipment...
Distributors   : All RICOH's distributors
                 In FRANCE it's:
                 Micro Puissance
                 1, av de Norvege, ZA de Courtaboeuf, BP 79
                 91943 Les Ulis cedex France
                 Call Mrs Andre Filleau at tel : (33 1) 69 07 12 11.

                 Ricoh will also present another voice recognition
                 chip based on the more advanced word spotting
                 technique. This product will be available later this
                 year (see Section 3, "VRP OF THE FUTURE"...)
---------------------------------------------------------------
Name           : RF5A65 (Voice Recognition Processor)
                 RF5C72-0021 (Feature Extraction Chip DSP)
Made by        : RICOH CORPORATION
Country        : USA-JAPAN
Price          : ?
Nb of words    : 60 words in the SI (speaker independent) mode
                 (0.15-2.0 sec)
                 120 words in the SD (speaker dependent) mode
                 (0.15-2.0 sec)
                 SD mode requires 3 training passes.
                 IWR isolation time: min. 250 msec to 350 msec.
Specifications : IWR and SD and SI.
                 These chips were made for the RVZ200 general-purpose
                 board, before a specific one-chip solution was
                 created. This board uses Ricoh's BTSP matching
                 technique.
                 Protocol commands between a user's system and the
                 unit board are available over a serial interface
                 (RS-232-C) or a parallel port (8 bit).
                 Noise reduction uses a spectral subtraction method
                 with 2 microphones, in order to recognize words in
                 noisy conditions (for example in a car).
Demo board     : RVZ2000, $3000.
Example of applications :
                 Car, dialer, voice-activated equipment...
Distributors   : All RICOH's distributors
---------------------------------------------------------------

+------------------------------------------------------+
|             Section 2 : VRP OF THE PAST              |
|         (looking for information about them)         |
|         ( Perhaps they are available today )         |
+------------------------------------------------------+

Name           : SRB32
Made by        : SANYO
Country        : Japan
Price          : ?
Nb of words    : 32 words
Specifications : Seems to have been developed for an automotive
                 application...
Demo board     : ?
Distributors   : ?
---------------------------------------------------------------
Name           : uPD7761/MC4760
Made by        : NEC
Country        : Japan
Price          : ?
Nb of words    : ?
Specifications : IWR and SD.
Demo board     : its name seems to be "K3" or "K2".
Distributors   : NEC discontinued this product in 1988.
---------------------------------------------------------------
Name           : uPD7764
Made by        : NEC
Country        : Japan
Price          : ?
Nb of words    : ?
Specifications : CSR and SI. It's a Speech Recognition Processor
                 dedicated to pattern matching, and it is supposed to
                 handle speaker-dependent, connected words....
Demo board     : ?
Distributors   : NEC discontinued this product in 1988.
---------------------------------------------------------------
Name           : MN1263
Made by        : MATSUSHITA (PANASONIC)
Country        : Japan
Price          : $30
Nb of words    : 24 words
Specifications : IWR and SD
Demo board     : ?
Distributors   : ?
---------------------------------------------------------------
Name           : ?
Made by        : NTT
Country        : Japan
Price          : ?
Nb of words    : 32 words
Specifications : IWR and SD
Demo board     : ?
Distributors   : ?
---------------------------------------------------------------
Name           : VOICE III
Made by        : ADVANCED PRODUCTS & TECHNOLOGY
Country        : USA
Price          : ?
Nb of words    : 128 words in SI and 6400 words in SD.
Specifications : IWR and SI or SD. Advanced Products & Technology
                 seems to produce custom chips under the name
                 "VoiceIII"...
Demo board     : ?
Distributors   : ?
---------------------------------------------------------------
Name           : VOICE
Made by        : ASULAB
Country        : Switzerland
Price          : ?
Nb of words    : 15 words
Specifications : IWR and SI. It was made for asking your watch to
                 tell you the time.
Demo board     : ?
Distributors   : ?
---------------------------------------------------------------
Name           : VX2 or/& Voice Master
Made by        : CITIZEN
Country        : Japan
Price          : ?
Nb of words    : 27 words
Specifications : IWR and SI. It was made for asking your watch to
                 tell you the time, but seems to have disappeared
                 too.
Demo board     : ?
Distributors   : ?
---------------------------------------------------------------

+------------------------------------------------------+
|            Section 3 : VRP OF THE FUTURE             |
|          (looking for information about it)          |
|         ( Perhaps they are available today )         |
+------------------------------------------------------+

Name           : ?
Made by        : RICOH CORPORATION
Price          : ? (Low cost)
Nb of words    : 60 words in the Speaker Dependent mode (SD).
                 30 words in the Speaker Independent mode (SI).
Specifications : CSR (word spotting) and SD and SI.
                 One-chip Voice Recognition LSI.
                 Recognition method: Duration Based State Translation
                 Model.
                 The LSI includes the main components (mic amp., ADC,
                 memories...).
                 High robustness to background noise (in a car).
                 Power supply voltage: 4.75-5.25 V,
                 operating clock frequency: 16 MHz,
                 operating current: up to 100 mA,
                 package: 80 pin Plastic Flat Package,
                 parallel port for host micro-controller.
Demo board     : ?
Distributors   : All RICOH's distributors at the end of 1995.
                 This product was first announced on Feb. 10, 1995.
---------------------------------------------------------------
Well, that's all. If you want to ask me any questions, contact me