Benjamin Reaves 405 El Camino Real #458, Menlo Park, California 94025 USA tel +1 650 924 2367. email benreaves@stanfordalumni.org Resume brss.us/r Linkedin linkedin.com/in/benReaves
| Patents |
Method and System for Adjusting the Voice Prompt of an Interactive System Based Upon The User's State
U.S. Patent 7,881,934, January 2011
Method and system for speech recognition using grammar weighted based upon Location Information
U.S. Patent 7,328,155, February 2008
Shortcut names for use in a speech recognition system
U.S. Patent 7,292,978, November 2007
Speech recognition system having multiple speech recognizers
U.S. Patent 7,228,275, June 2007
Speech Detection in the Presence of Noise by Determining Variance over Time of Frequency Band Limited Energy
U.S. Patent 5,579,431, November 1996. Continuation in part: patents 5,617,508 (November 1997); 5,826,230 (October 1998).
|
| Research Papers |
Nogaito, I., Tanaka, M., Reaves, B., Nawa, K. 2009. An adaptation of Attribution Between an In-vehicle Device and a Driver. In Jidousha Gijutsukai (Society of Automotive Engineers - Japan) Spring conference (Nagoya, Japan, May 21, 2009). (Link from SAE-J)
Nass, C., Jonsson, I., Harris, H., Reaves, B., Endo, J., Brave, S., and Takayama, L. 2005. Improving automotive safety by pairing driver emotion and car voice emotion. In CHI '05 Extended Abstracts on Human Factors in Computing Systems (Portland, OR, USA, April 02 - 07, 2005). Association for Computing Machinery, Special Interest Group on Computer-Human Interaction (ACM SIG CHI) 2005. 1973-1976. (ACM link DOI 10.1145/1056808.1057070)
Jonsson, I., Nass, C., Endo, J., Reaves, B., Harris, H., Ta, J. L., Chan, N., and Knapp, S. 2004. Don't blame me I am only the driver: impact of blame attribution on attitudes and attention to driving task. In CHI '04 Extended Abstracts on Human Factors in Computing Systems (Vienna, Austria, April 24 - 29, 2004). ACM SIGCHI '04. 1219-1222. (ACM link DOI 10.1145/985921.986028)
Reaves, Ben and Nishino, Atsushi. ATR-Matrix: Implementation of a Speech Translation System. In Proceedings of the Acoustical Society of Japan, Spring 1998. (first author) (link from Google Scholar) (T. Takezawa wrote an update in Proceedings of the Information Processing Society of Japan, March 1999)
Junqua, Jean-Claude; Mak, Brian; Reaves, Ben. A Robust Algorithm for Word Boundary Detection in the Presence of Noise. IEEE Transactions on Speech and Audio Processing, July 1994, pp. 406-412. (full paper in a refereed journal) (Link from Google Scholar)
Reaves, Ben. Parameters for Noise Robust Speech Detection. Proceedings of the Acoustical Society of Japan, Fall 1993. (sole author) (Link from JIST)
Reaves, Ben and Junqua, Jean-Claude. Real-time Preprocessing for Speech Recognition. Proceedings of the Acoustical Society of Japan, Fall 1992. (Link from JIST)
Mak, Brian; Junqua, Jean-Claude; Reaves, Ben. A Robust Speech / Non-speech detection algorithm using time and frequency based features. (Link from IEEE)
Reaves, Ben and Junqua, Jean-Claude. A Study of Endpoint Detection Algorithms in Adverse Conditions: Incidence on a DTW and HMM Recognizer. Eurospeech, Genoa, Italy, 1991. (International research conference) (link from Google Scholar)
Reaves, Ben. Comments on 'An Advanced Endpoint Detection Algorithm'. IEEE Transactions on Acoustics, Speech, and Signal Processing, February 1991. (sole author) (Link from IEEE)
(DBLP link to some of the papers)
|
| Media and news articles related to my work |
Wang, Jinjun; Zhu, Shenghuo; Gong, Yihong, 2010. Normalizing Multi-subject Variation for Drivers' Emotion Recognition (Link to pdf)
Wang, Jinjun; Gong, Yihong, 2009. Normalizing Multi-subject Variation for Drivers' Emotion Recognition (Link to pdf)
Yomiuri Newspaper, Yon ka kokugo no Nichijou Kaiwa ni Shunji Hon'yaku [Instant translation of everyday conversation in four languages], July 23, 1999, front page. (link to scanned image)
All Things Considered (a respected news broadcast in the USA from National Public Radio) also described this translation system in 1999 (this link or this link), as did several others including Gakken (students' magazine) and NHK (a well respected Japanese news broadcaster; a YouTube video of the English broadcast is at this link).
Yomiuri Daily News, Computerized translation system bodes well for borderless chat, July 1999, Kansai And West page. (Link to scanned image)
C-STAR International Joint Experiment Towards Interpreting Telecommunications: A Multilingual Speech Translation Experiment. July 1999. (Link to press release) This work continues here.
Kansai Professional Computing Association, June 17, 1998. (link to introduction)
Reaves, Ben, Reading beyond a bad header with tar, Sys Admin: the Journal for Unix System Administrators, vol. 2 no. 6, November 1993. (link to reference) |
| Toyota InfoTechnology Center, USA, Inc., Mountain View, California |
Senior Researcher. (Engineer 2002-2003; Assistant Manager 2003-2004; Research Manager 2004-2010; Senior Researcher 2010-)
Toyota InfoTechnology Center is Toyota's most forward-looking research center, with offices in Tokyo, California, New Jersey, and New York. It is a joint venture primarily between Toyota Motor Corporation and KDDI, Inc.
Responsible for implementation of several working prototypes of Driver-Vehicle Interaction systems, focusing on Speech Interaction and on Dialog Management that takes into account the driver's cognitive load and driving conditions. Managed a diverse team of professional Engineers, Cognitive Psychologists, and other Scientists, as well as collaboration projects with several Universities and research institutes in the USA and Europe. Worked with local and Japanese upper management to promote technologies and studies that improve our understanding of the interaction between the driver and in-vehicle information systems.
Used speech recognition and synthesis technologies from Nuance Communications, Cepstral, Loquendo, AT&T; Emotion Recognition from Affective Media and Nemesysco; STISIM Drive Simulator software from Systems Technology Inc.; Machine Learning from NEC Labs America; Sensors from Thought Technologies Inc. (Biograph, Flexcomp, Procomp); Agent Technology from Dejima, Answers Anywhere, and its parent company Sybase. Other technologies include PLX and Palmer for reading CAN-bus, Methode for head-position sensors, Attention Technologies Inc for pupil-shape detection, Vivometrics for breathing rate detection, and data-logging devices including DashDaq, Carchip, BP24 altimeter; SPSS for statistical processing.
Worked primarily with: Jack Norikazu Endo, Roger Melen, Akio Orii at Toyota; Cliff Nass at Stanford. |
|
|
|
| Dejima Inc, San Jose, California;
and the Center for the Study of Language and Information, Stanford University |
Research Engineer. January 2002 - October 2002.
Dejima Corp. is a leader in agent network technology for natural language understanding. Based in San Jose, it has offices in Tokyo and London. CEO: Dr. Babak Hodjat.
Working with Ing Marie Jonsson, we brought in two projects from Oracle and one from Toyota ITC. The first Oracle project involved a serial port connection to a Kyocera telephone based on the Palm OS, and user studies for a proposed speech recognition system with natural language interpretation. The second Oracle project used the data gathered in the first to build a Speech Accessor for a mobile database access device based on a reliable serial port connection. We used the SRI Language Modeling Toolkit (SRILM) and SRI's speech recognition software, Dynaspeak. In the third project, for Toyota, we combined technology from Nuance, Dejima, and SRI in a prototype in-car telematics system.
In addition, during the summer of 2002 I was on the Archimedes Team at Stanford University, applying Nuance and Dejima technology to a Total Access System intended to allow people with various disabilities to access computers and computer networks. This work was under Dr. Neil Scott, who has since moved to the University of Hawaii.
|
|
|
|
| Nuance Communications, Menlo Park, California. Dialog Design group. |
Software Engineer. October 1999 - April 2001.
Nuance is a leader in voice recognition and speech recognition for IVR platforms. My team worked on the SpeechObjects product, a set of reusable Java code that embodies the results of Nuance's Dialog Research and the best practices of Professional Services.
My work was measuring the performance of SpeechObjects on real data collected from the field.
I also streamlined the build-test-release process for product management, and wrote the Advanced Features Tutorial, which has become the basis for several commercially deployed IVR products.
|
|
|
|
| Advanced Telecommunications
Research Laboratory: Interpreting Telecommunications Laboratory (ATR-ITL) Kyoto, Japan. |
Invited Engineer. July 1995 - October 1999
Began at ATR researching MCE/GPD, a computationally intensive algorithm for training acoustic models, under Dr. Shigeru Katagiri.
ATR-ITL was created by a fixed-duration contract from the Key Technology Center of the Japanese government, which stated that ATR-ITL must create a speech translation system. I took responsibility for creating a prototype of this integrated system in 1996, after which I became manager of a group within ATR-ITL dedicated to building this speech-to-speech translation system. Rapid development of the prototype was made possible by the use of some powerful programming languages, in particular Python and Tcl/Tk. By 1997 we had a working prototype that could be demonstrated to visitors. I brought together our group and researchers from each department of ATR-ITL to work on the various issues.
The resulting system was shown at various international research conferences including ICASSP in Seattle, and was deployed with real users in an international translation network consisting of the C-Star-II symposium (link), which includes research institutes in Germany, USA, France, and Korea, among other countries. This software won the ATR Software award, and best-of-show for the 1997 Open House. Finally, this system was shown at the Ministry of Posts and Telecommunications of the Japanese central government, which then issued a new contract starting in 2000, creating a new company, ATR-SLT (Spoken Language Translation), for a duration of 5 years and for several hundred million dollars.
Worked primarily with Shigeru Katagiri, Harald Singer, Yoshinori Sagisaka, Nick Campbell, Hiroshi Iida, Sei-ichi Yamamoto, Toshiyuki Takezawa, Rainer Gruhn. |
|
|
|
| Asahi Culture Center, Osaka, Japan | Instructor. April 1995 - June 1995. (part time)
Taught their most advanced English course: "English for International Conferences." This was a night class, once per week. Students numbered about 30, most of whom were working at various Japanese companies in downtown Osaka during the day; some were retired. The Director of the Center at the time was Kouichi Kanayama. I prepared all materials including videotapes and printed notes. Students gave presentations on subjects including the circulation of heat in prefabricated houses, and an argument that a crash of a JAL airplane outside Tokyo's Narita Airport was due to willful action of the pilot. We used H. J. Tichy's book, "Effective Writing for Engineers, Managers, Scientists" as a text. |
|
|
|
| Speech Technology Laboratory, Santa Barbara, California, and its parent company Matsushita Electric Industrial Co., Central Research Laboratory, Moriguchi, Japan. |
Research Engineer. September 1985 - June 1995.
The first project at Speech Technology Laboratory in Santa Barbara was to continue development of the Harmonic Synthesizer, which could play back speech or music at high quality, at a faster or slower rate, without altering the pitch, formants, or musical quality of the original.
Matsushita was at this time developing a voice operated soft drink vending machine for Coca Cola's 100th anniversary in 1986 and, under a tight deadline, required speech data covering the range of American accents, speaking the names of the Coca Cola Company's products, to train their acoustic models. I traveled around the US and collected data in Los Angeles, Chicago, New York, and at Coca Cola headquarters in Atlanta. It was extremely important that the president of Coca Cola could operate the machine, and his English was spoken with a heavy Cuban accent. The deployment of the voice operated vending machine was successful.
In 1986 I developed STL's first real time recognizer, using a TMS32010 DSP chip, performing 14th order cepstral analysis in real time (impossible on PCs at that time). This enabled STL to test its algorithms live. In order to make a truly hands free system, I researched the subject of automatic speech detection and found surprisingly few published papers.
In 1987 I went on a business trip to the headquarters of STL's parent company, Matsushita Denki, in Kadoma, near Osaka, Japan. The first task there was to work with the translation group at Wireless Research Laboratory (Musenken), post processing the English output. In parallel with this effort, I worked on a very new project using one of the earliest Object Oriented languages, Objective-C, to develop a Rapid Prototyping system. This involved graphics programming on Sun workstations. The software could be used in other divisions of Matsushita to simulate the behavior of new consumer products, using a state diagram of the product as input. This software, named ADMIRE, won the 1988 Japan Software Technology Award (along with others: Ohtsu, Tsuga, Nishimura, Nakai). This work was done under Mr. Kazuhiro Tsuga, with Masaru Nakai, Takashi Ohtsu, and Mariko Nishimura.
By 1989 I was based in Matsushita's Central Research Laboratory (CRL) in Moriguchi, working in their Speech Recognition group. During my years at CRL we implemented speech recognition in several products including a car navigation system, a GPS system, and a small-footprint, small-vocabulary recognizer to be used in toys and inexpensive consumer products. It became clear that the performance of the automatic speech detection had a great influence on the overall performance of the product. Complementing Matsushita's efforts in signal processing for speech recognition, and their advanced hardware design capability, I worked on the missing link: new, noise-robust algorithms for the automatic detection of speech. This resulted in several papers (1991-1994) and three patent applications. All three were approved (1995-1998).
Worked primarily with Jean-Claude Junqua, Hisashi Wakita, Brian Hanson, Ted Applebaum in the US; and in Japan with Masahiro Hamada, Kenji Matsui, Eiichi Tsuboka, Noriyo Hara, Yumi Wakita, Kazuhiro Tsuga, Takashi Ohtsu, Masaaki Kitano, Jun-ichi Nakahashi, Takahiro Kamai, and Yuriko Suruga Junqua. |
|
|
|
| Hughes Aircraft Co.,
Ground Systems Group Fullerton, California |
Member of Technical Staff, September 1983 - September 1985.
Hardware design of a spread spectrum receiver offering high data rates in a reliable, jam-resistant modem nicknamed LC20. Responsible for design, test, and operation of 7 cards comprising its major functions: noise-robust Correlator, live Test, Timing and Control. Involved extensive use of Schottky TTL and ECL (Emitter Coupled Logic, which is faster than TTL due to lower voltage swings across internal capacitance, but requires more care in the construction of circuit paths). Used SPICE for modeling of high speed switching signals, and PCAD to aid in the pre-fabrication test of the circuits. Obtained a Secret level security clearance. Worked with Will White and Al Falone. |
|
|
|
| Student Internships | |
| Veterans Administration Hospital Palo Alto, California |
Biomedical Engineer, March 1983 - July 1983. Designed and built a Z80 microprocessor circuit to interface a stenotype to a speech synthesizer. Unlike a typewriter, the stenotype enables speech generation fast enough to keep pace in a conversation, up to 250 words per minute. Its phonetic input reduces TTS dictionary mistakes and enables more creative expression by the user. Worked with Richard Steele. |
|
|
|
| Speech Communication Research Lab, Los Angeles | Research Assistant, January 1982 - September 1982, part time. Analysis and synthesis of speech using ILS on a PDP 11 system with a real time interface to reel-to-reel low noise tape recording equipment. Performed system administration: periodic backups, purchasing of a VAX, debugging the audio data acquisition hardware. Consulted on noise removal from an in-flight recording made prior to the crash of a private airplane. Worked with June Shoup, director. |
|
|
|
| USC Programming & Data Processing Department | Teaching Assistant, 1981. Lectured to a class of thirty students on Fortran-77. Held an office hour, graded papers, and helped students with their projects. Evaluated performance and graded papers of over 100 students. Administered examinations. |
|
|
|
| Bell Laboratories, Holmdel, New Jersey |
Technical Assistant, Summers of 1980 and 1981. Designed and constructed digital hardware to test, in-circuit, a circuit card containing a bank of 24 special purpose Echo Canceler ASICs. Learned the Echo Canceler's adaptive filtering operation, and its usage in the telephone network in a TASI (Time Assignment Speech Interpolation) environment. Worked with Al Maione, Chih-Yu Kao. |
|
|
|
| USC Civil Engineering Department | Research Assistant, 1978. Analysis of the movement of a structure during an earthquake, modeling the soil by numerical solution of a differential equation. Graphical display on Tektronix, and calculation on a VAX running VMS. Worked with Dr. Udwadia. |
|
|
|
| USC Testing Bureau | Interface for rapid scoring of student tests, 1978. Programmed an Altair 8080 interface to an IBM Selectric Typewriter; proctored tests including LSAT, GRE, GMAT. Worked with Bob Jones, director. |
|
|
|
| Hughes Aircraft Co., Radar Systems Group. |
Student Engineer, Summer 1979. Hardware design and implementation for testing a Programmable Data Processor which performs Fast Fourier Transform of high speed data in real time using Emitter Coupled Logic (ECL). Worked with Joe Larabell. |
|
|
|
| Volunteer activities |
|
| Formal |
Courses include Linear Estimation Theory (Kailath), Communications Engineering Principles (Lusignan), Introduction to Statistical Signal Processing (Gray), Signal Detection and Nonlinear Estimation (Kailath, El Gamal), Digital Filtering (Treichler), Introduction to Fourier Optics (Goodman), Fourier Transform and its Applications (Bracewell), Adaptive Systems (Widrow).
Masters Project: analysis of frequency domain LMS adaptive filter in white noise (under Dr. Bernard Widrow).
Bachelor of Science. Graduated in the top 20% of all Engineering students. Concentration in Signal Processing. Courses include: Numerical Analysis (Bekey), Probability & Statistics (Papadapoulos), Engineering Honors Colloquium (Rusch).
Additional Major: East Asian Languages and Cultures - Japanese (Dr. Han).
Additional Concentration: Electronic Music.
Courses include Signals and Systems, Analog Circuits, Numerical Analysis. |
| post-graduate indulgences: |
|