Paralinguistic Paraphrase Corpus


Overview


The Paralinguistic Paraphrase Corpus contains 3423 spoken English phrases (2835 with focus, 588 without focus) recorded by 9 British native speakers of English, together with 1697 English paraphrases that reflect the implications of the corresponding utterances.

Here are some example entries from the corpus:

Original text (focused word underlined) | Focused utterance | Paraphrase
A biker enjoys a coffee | | There is one biker enjoying a coffee
A biker enjoys a coffee | | The person enjoying a coffee is a biker

The data were collected in 2022 to investigate whether paralinguistic cues (how something is said) can be mapped into linguistic information (what is said). For a description of the corpus, see:

Original:

Naoaki Suzuki, Satoshi Nakamura. (2022)
Representing 'how you say' with 'what you say': English corpus of focused speech and text reflecting corresponding implications.
Proc. Interspeech 2022, 4980-4984.

Revised:

Naoaki Suzuki, Satoshi Nakamura. (2022)
Representing 'how you say' with 'what you say': English corpus of focused speech and text reflecting corresponding implications.
https://arxiv.org/abs/2203.15483


Terms of Use


License: This corpus is distributed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Citation: If you use this data, please cite both our paper above and the original paper of the Flickr8k corpus: (paper)


Data Downloads


You can download the Paralinguistic Paraphrase Corpus after agreeing to the Terms of Use above.


Contact


Satoshi Nakamura, Nara Institute of Science and Technology

s-nakamura [0x40] is.naist.jp

Naoaki Suzuki, Nara Institute of Science and Technology

suzuki.naoaki.sg4 [0x40] is.naist.jp