Overview
The Paralinguistic Paraphrase Corpus contains 3423 spoken English phrases (2835 with focus, 588 without focus) recorded by 9 native speakers of British English, together with 1697 English paraphrases that reflect the corresponding implications.
Some examples from the corpus:
Original text (focused word underlined) | Focused utterance | Paraphrase |
---|---|---|
A biker enjoys a coffee | | There is one biker enjoying a coffee |
A biker enjoys a coffee | | The person enjoying a coffee is a biker |
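
The two example rows share one original sentence but carry different paraphrases. As a minimal sketch of how such rows might be held in memory and grouped by original sentence, consider the Python below. The `Example` class, its field names, and `group_by_original` are hypothetical illustrations and not part of the corpus distribution, whose actual file format is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One row of the example table (hypothetical field names)."""
    original_text: str
    paraphrase: str


def group_by_original(examples):
    """Map each original text to the list of its paraphrases, in input order."""
    grouped = defaultdict(list)
    for ex in examples:
        grouped[ex.original_text].append(ex.paraphrase)
    return dict(grouped)


# The two rows shown in the example table.
rows = [
    Example("A biker enjoys a coffee", "There is one biker enjoying a coffee"),
    Example("A biker enjoys a coffee", "The person enjoying a coffee is a biker"),
]
paraphrases = group_by_original(rows)
print(len(paraphrases["A biker enjoys a coffee"]))
```

Grouping this way keeps every paraphrase of a sentence together, which may be convenient when comparing how different focus placements change the implied meaning.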
The data were collected in 2022 to investigate whether paralinguistic cues (how something is said) can be mapped into linguistic information (what is said). For a description of the corpus, see:
Original:
Naoaki Suzuki, Satoshi Nakamura. (2022) Representing 'how you say' with 'what you say': English corpus of focused speech and text reflecting corresponding implications. Proc. Interspeech 2022, 4980-4984.
Revised:
Naoaki Suzuki, Satoshi Nakamura. (2022) Representing 'how you say' with 'what you say': English corpus of focused speech and text reflecting corresponding implications. https://arxiv.org/abs/2203.15483
Terms of Use
License: This corpus is distributed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
Citation: If you use this data, please cite both our paper above and the original paper of the Flickr8k corpus: (paper)
Data Downloads
You can download the Paralinguistic Paraphrase Corpus after agreeing to the Terms of Use above.
Contact
Satoshi Nakamura, Nara Institute of Science and Technology
s-nakamura [0x40] is.naist.jp
Naoaki Suzuki, Nara Institute of Science and Technology
suzuki.naoaki.sg4 [0x40] is.naist.jp