About MICASE
The history, purpose, and ideas behind the corpus of academic speech.
1. What is MICASE?
2. What is ‘academic speech’?
3. The ideas behind MICASE
4. How we chose which speech events to record
5. How we collected the MICASE data
6. A brief history of MICASE
7. How MICASE benefits our community
8. The MICASE team through the years
MICASE is a collection of nearly 1.8 million words of spoken academic English (see next section for a definition of academic speech), all recorded on the University of Michigan campus, and transcribed into searchable documents. You can search MICASE for words and phrases to see how English is being used in academic settings.
Academic speech is defined as that speech which occurs in academic settings. In other words, it is not pre-defined as something like ‘scholarly discussion’. In academic settings, we might, for example, find such speech acts as jokes, confessions, and personal anecdotes, as well as the more prototypical definitions, explanations and intellectual justifications. Therefore, the real question is how we define ‘academic setting’. We have taken an open yet circumscribed stance on this.
The corpus includes the following speech events: small and large lectures (62), public interdisciplinary or departmental colloquia (13), discussion sections (9), student presentations (11), seminars (8), undergraduate lab sessions (8), lab group and other meetings (6), one-on-one tutorials (3), office hours (8), advising consultations (5), dissertation defenses (4), study groups (8), interviews (3), campus/museum tours (2), and service encounters (2).
On the other hand, we have excluded certain events that occur on campus but would not be significantly different if they had occurred in other locations. For example, we did not record food-ordering sequences in university food outlets or discussions among those who work in the university’s plant or grounds departments. We do not consider these speech events central or particular to a university community’s educational mission.
In 1997, the English Language Institute (ELI) at the University of Michigan started the MICASE project. Dr. Rita Simpson was the original project manager, working with Professor John Swales (faculty advisor) and Dr. Sarah Briggs (testing advisor). The project was driven by two questions:
Because MICASE aimed to record a wide range of academic speech, our sampling goals spanned fifteen different types of speech events and four major academic divisions within those types (Humanities and Arts, Social Sciences, Biological and Health Sciences, and Physical Sciences). We adopted stratified random sampling. Each recording is classified according to speech event type, a pre-assigned number indicating the academic discipline, two letters representing the majority of participants in the event (e.g. junior undergraduate, senior faculty, staff), and a final three-digit sequence that tracks the chronological order in which the tape was recorded. For example, transcript number LEL115SU015 is a recording of a large lecture (LEL) in anthropology (115), at the senior undergraduate level (SU), and is the 15th speech event recorded for MICASE.
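For readers who work with the transcripts programmatically, the naming scheme described above can be split into its four parts. The following is a minimal sketch, not a MICASE tool: the function and field names are hypothetical, and it assumes that every event type code is three uppercase letters and every discipline code is three digits, as in the LEL115SU015 example.

```python
import re
from typing import NamedTuple


class TranscriptId(NamedTuple):
    """Hypothetical container for the four parts of a transcript number."""
    event_type: str    # speech event type, e.g. "LEL" (large lecture)
    discipline: str    # pre-assigned discipline number, e.g. "115"
    participants: str  # majority participant level, e.g. "SU"
    sequence: int      # chronological recording number, e.g. 15


# Assumed layout: 3 letters, 3 digits, 2 letters, 3 digits.
_ID_PATTERN = re.compile(r"([A-Z]{3})(\d{3})([A-Z]{2})(\d{3})")


def parse_transcript_id(transcript_id: str) -> TranscriptId:
    """Split a transcript number such as 'LEL115SU015' into its parts."""
    match = _ID_PATTERN.fullmatch(transcript_id)
    if match is None:
        raise ValueError(f"unrecognized transcript number: {transcript_id!r}")
    event_type, discipline, participants, sequence = match.groups()
    return TranscriptId(event_type, discipline, participants, int(sequence))


parsed = parse_transcript_id("LEL115SU015")
print(parsed.event_type, parsed.discipline, parsed.participants, parsed.sequence)
```

The discipline code is kept as a string so that any leading zeros survive, while the sequence is converted to an integer so that transcripts can be sorted in recording order.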
All recordings were made with a digital audio tape recorder with two external stereo microphones, and at selected events, a video recorder. Two researchers attended most speech events in order to identify speakers and facilitate transcription by taking field notes about nonverbal contextual information; however, in small groups (e.g. advising sessions, office hours, study groups) where an observer’s presence would have been intrusive, the research assistants left the room after the equipment was set up. All speech was recorded with written consent from the major speakers and verbal consent from other participants. Demographic information (sex, age group, university position, and native language) was collected from each speaker on a form distributed at the end of each event. The speaker information is included in the header of each transcript and is also entered into a separate database. All DAT recordings were captured and stored as MP3 format sound files for use with our computer transcription program, SoundScriber, and have also been re-digitized as WAV format files and transferred to data CD for archival purposes.
In June 2001, the first phase of the project was completed, with over 190 hours of academic speech recorded. In April 2002, the transcription and proofing of all transcripts was completed (approximately 1.8 million words).
Then, in May 2002, the original search interface was launched, with a redesigned version released in June 2007. It has grown in popularity each year since its release, approaching 140,000 hits in 2006. In 2009, we are excited to release a number of new features and support tools, including new MICASE online demos and new resources for EAP/ESL teachers!
The project is currently managed by Dr. Ute Römer (Michigan Corpus Linguistics, Unit Director), with support from Dr. Matthew Brook O’Donnell (Post-doctoral Research Fellow). However, the MICASE project has only been possible with the help of a long list of talented faculty, staff, and research assistants over the years.
The ELI has committed resources to MICASE for a series of interlocking reasons:
We hope the MICASE project continues to provide helpful resources for researchers, EAP teachers, and English language learners.
MICASE Team in 2001

left to right: John Swales, Sarah Briggs, Janine Ovens, Rita Simpson
MICASE Team in June 2007

left to right: Yung-Hui Chien, Jesse Sielaff, Stefanie Wulff, Sheryl Leicher, Annelie Adel, John Swales
MICASE Team in November 2007

left to right: Stefanie Wulff, John Swales, Ute Römer, Nick Ellis, Jesse Sielaff, Yung-Hui Chien, Merche Querol (ELI visiting scholar)
MICASE Team in November 2008

left to right: Geoff Ho, Jesse Sielaff, Emily Lin, John Swales, Miranda Kozman, Nick Ellis, Matt O’Donnell, Ute Römer
MICASE Team in June 2009

left to right: John Swales, Ute Römer, Edwin Teng, Matt O’Donnell, Miranda Kozman, Emily Lin, Geoff Ho, Madison Stuart