The motivation for annotating structural events in spontaneous speech is straightforward. A raw stream of words does not convey complete information, because the structural information beyond the words (metadata) is just as important as the words themselves. This structural information is critical both for improving the human readability of transcripts and for enabling downstream NLP methods, which typically require fluent, formatted input. Metadata annotation can be viewed as a post-processing step applied to the standard verbatim transcription. It involves identifying a range of spontaneous speech phenomena (fillers and disfluencies) and inserting breakpoints that divide the flow of speech into syntactic/semantic units (SUs). Under SimpleMDE, annotators identify fillers, deletable regions (DelRegs) of words within edit disfluencies, and SUs. Transcripts annotated for metadata can be "cleaned up" to enhance readability; for instance, DelRegs and fillers might be removed and each SU presented on a separate line of the transcript.
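The clean-up step described above can be illustrated with a minimal Python sketch. The token representation used here (a `Token` record with `filler`, `delreg`, and `su_end` flags) and the function name `clean_up` are assumptions made purely for illustration; they do not reproduce the actual SimpleMDE annotation format. The sketch drops fillers and DelRegs and places each SU on its own line.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One transcribed word with hypothetical metadata flags (illustrative only)."""
    word: str
    filler: bool = False   # word belongs to a filler
    delreg: bool = False   # word lies inside a deletable region of an edit disfluency
    su_end: bool = False   # an SU boundary follows this word


def clean_up(tokens):
    """Remove fillers and DelRegs; return one string per SU."""
    lines, current = [], []
    for tok in tokens:
        # Keep only words that are neither fillers nor part of a DelReg.
        if not (tok.filler or tok.delreg):
            current.append(tok.word)
        # Close the SU even if its final word was itself removed.
        if tok.su_end:
            if current:
                lines.append(" ".join(current))
            current = []
    # Flush trailing words that were not followed by an SU boundary.
    if current:
        lines.append(" ".join(current))
    return lines


# Invented example utterance: "uh I want to I need to leave now you know it is late"
example = [
    Token("uh", filler=True),
    Token("I", delreg=True), Token("want", delreg=True), Token("to", delreg=True),
    Token("I"), Token("need"), Token("to"), Token("leave"),
    Token("now", su_end=True),
    Token("you", filler=True), Token("know", filler=True),
    Token("it"), Token("is"), Token("late", su_end=True),
]

cleaned = clean_up(example)
print("\n".join(cleaned))
```

For this invented utterance, the sketch prints "I need to leave now" and "it is late" on separate lines. A real annotation scheme distinguishes further categories that this flat representation ignores.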