TRICKLET Translation Research in Corpora, Keystroke Logging and Transcription

Changes of word class during the
translation process
Insights from a combined analysis of keystroke logging
and eye-tracking data
Tatiana Serbina, Sven Hintzen, Adjan Hansen-Ampah, Paula Niemietz,
Stella Neumann
Translation in Transition, Germersheim, 29.-30.01.2015
A HumTec Boost Fund Project funded by the Excellence Initiative of the German
State and Federal Governments
Overview
• Translation shifts
• Grammatical complexity
• Aims of the study
• Methodology
• Product-based analysis: word class changes ST vs. TT
• Process-based analyses: eye-tracking analysis, word class changes in intermediate versions of translations
Empirical translation studies
• Product-based studies
  • Method: corpus analyses
  • Typical research questions: translation shifts or translation properties
• Process-based studies
  • Method: translation experiments (frequently keystroke logging and eye-tracking)
  • Typical research questions: translators’ styles, levels of expertise and their effect on the translation process
• Our research
  • Treating keystroke logs as a corpus (cf. e.g. Alves & Magalhães 2004, Alves & Vale 2009, 2011)
  • A combination of product-based and process-based perspectives
Translation shifts
• Translation shifts: differences between source and target texts, e.g. part of speech change or change of semantic perspective (Čulo et al. 2008, Cyrus 2009, Halverson 2007)
• Changes of word class – transpositions (Vinay and Darbelnet 1958/1995, 36)
EO: Crumpling a sheet of paper seems simple and doesn't require much effort, but explaining [why the crumpled ball [behaves]Verb the way it does]Clause is another matter entirely.
GTrans: Ein Blatt Papier zusammen zu knüllen, erscheint einfach und erfordert wenig Anstrengung; [die [Verhaltensweise]Noun des Papierknäuels]NP zu erklären, ist dagegen eine völlig andere Sache. (KLTC PROBRAL GT7)
Word classes – contrastive difference
• German: nominal word classes 40.21%, verbal word classes 22.53% → ratio of 1.784
• English: nominal word classes 41.39%, verbal word classes 25.47% → ratio of 1.625
• More pronouns in German (8.45%) than in English (5.46%)
• German appears to be more nominal than English (Hansen-Schirra, Neumann, and Steiner 2012, 77-78)
• In the translations into German: more shifts from verbs to nouns and fewer shifts from nouns to verbs than in the opposite translation direction (Čulo et al. 2008, 50)
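The nominal-to-verbal ratios can be recomputed from the reported percentages. The following is a minimal sketch; the variable and function names are illustrative, and because the inputs are already rounded, the last digit may differ slightly from the published ratios.

```python
# Percentages as reported on this slide
# (Hansen-Schirra, Neumann, and Steiner 2012, 77-78).
shares = {
    "German": {"nominal": 40.21, "verbal": 22.53},
    "English": {"nominal": 41.39, "verbal": 25.47},
}


def nominal_verbal_ratio(nominal: float, verbal: float) -> float:
    """Return the number of nominal tokens per verbal token."""
    return nominal / verbal


ratios = {lang: nominal_verbal_ratio(**s) for lang, s in shares.items()}

for lang, ratio in ratios.items():
    print(f"{lang}: {ratio:.2f} nominal per verbal word class token")
```

A higher ratio indicates a more nominal profile, which is the sense in which German is described as more nominal than English.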
Grammatical complexity
• Association with different levels of grammatical complexity
  • Verbs – possible indicator that the process is realized canonically through a clause
  • Nominalizations – may result in a more condensed and thus grammatically more complex version (Halliday and Matthiessen 2014, 715).
EO: [After the [crumpling]Noun of a sheet of thin aluminized Mylar]PP, the researchers placed it inside a cylinder.
GTrans: [Nachdem sie ein dünnes Blatt aluminiumbeschichtetes Mylar [verknittert]Verb hatten]Clause, gaben sie es in einen Zylinder. (KLTC PROBRAL GT3)
• Translation: understanding of the more complex units in the ST could involve their paraphrase with grammatically simpler structures in the TT (Steiner 2001, Hansen-Schirra, Neumann, and Steiner 2012, 257-261)
Aims of the study
• Analysis of POS distribution and shifts between main word classes
• This study concentrates on nouns & verbs and investigates the cognitive effort during the translation process depending on
  • the word class in the original
  • the type of shift in the translation
• Analysis of intermediate versions in the keystroke logging data
Assumptions:
• due to the contrastive difference, the translation direction English-German may be characterized through shifts from verbs to nouns
• due to the process of understanding related to the grammatically dense noun phrases in ST, translations into German may be characterized through shifts from nouns to verbs (in the intermediate or final versions)
Our translation process data
Translation experiment (Neumann et al. 2010)
• Translation direction: English-German
Subjects
• 8 professional translators
• 8 physicists
Material
• Two versions of an authentic text with ten integrated stimuli
• Abridged version of a popular-scientific text published in the journal Scientific American Online
Apparatus
• Tobii 2150 remote eyetracker, software Tobii Studio 1.5
• Keystroke logging software Translog
Keystroke Logged Translation Corpus
The corpus consists of:
• 2 versions of the original (source texts)
• 16 translations (target texts)
• 16 log files (process texts)
Corpus size (comprising STs and final TTs): approx. 3,650 words
Corpus register: popular scientific writings
(Serbina, Niemietz, and Neumann, forthcoming)
Methodology
• Automatic POS annotation of ST and TT using TreeTagger (Schmid 1994)
• Manual alignment between ST and TT words using the alignment tool (Hansen-Ampah 2014) based on the alignment guidelines (Samuelsson et al. 2010)
• Manual extraction of ST words belonging to main word classes and the aligned TT words
• Translation pairs selected for further analysis:
  • Shifts between nominal and verbal variants
  • Random samples of verbs and nouns that do not contain a shift in the final translation
• Keystroke data: identification of intermediate versions for the selected ST words
• Eye-tracking data: calculation of total fixation duration as a concrete indicator of cognitive effort for the selected ST words
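The extraction step was carried out manually in the study. Purely as an illustration of the underlying logic, the sketch below labels aligned ST-TT pairs as shifts or non-shifts from their POS tags. The prefix mappings are an assumption based on the tagsets commonly used with TreeTagger (a Penn-style tagset for English, STTS for German), and the aligned pairs and their tags are hypothetical.

```python
from typing import Optional

# Assumed prefix-to-class mappings, not taken from the study itself.
EN_PREFIXES = {"N": "NOUN", "V": "VERB", "JJ": "ADJ", "RB": "ADV"}
DE_PREFIXES = {"N": "NOUN", "V": "VERB", "ADJ": "ADJ", "ADV": "ADV"}


def main_class(tag: str, prefixes: dict) -> Optional[str]:
    """Map a fine-grained POS tag to a main word class, or None."""
    for prefix, word_class in prefixes.items():
        if tag.startswith(prefix):
            return word_class
    return None


def classify_pair(st_tag: str, tt_tag: str) -> Optional[str]:
    """Label an aligned ST-TT pair as 'no shift' or as 'X → Y'."""
    st_class = main_class(st_tag, EN_PREFIXES)
    tt_class = main_class(tt_tag, DE_PREFIXES)
    if st_class is None or tt_class is None:
        return None  # at least one word lies outside the main word classes
    if st_class == tt_class:
        return "no shift"
    return f"{st_class} → {tt_class}"


# Hypothetical aligned pairs: (ST word, ST tag, TT word, TT tag)
aligned = [
    ("behaves", "VVZ", "Verhaltensweise", "NN"),
    ("crumpling", "NN", "verknittert", "VVPP"),
    ("ball", "NN", "Papierknäuel", "NN"),
]

labels = [classify_pair(st_tag, tt_tag) for _, st_tag, _, tt_tag in aligned]
print(labels)
```

Pairs labelled "no shift" correspond to the pool from which the random samples of non-shifted nouns and verbs would be drawn.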
POS distribution in ST and TT
              English STs         German TTs
Nouns         32.75% (113/345)    27% (882/3267)
Verbs         17.39% (60/345)     15.81% (511/3267)
Adjectives    11.30% (39/345)     9.77% (319/3267)
Adverbs       4.35% (15/345)      5.17% (169/3267)
• Proportionally more nouns and verbs in English originals than in German translations
• Technical problem: compound nouns counted as several nouns in English but as one in German (Čulo et al. 2008, 49)
Types of word class shifts
Shift          Absolute numbers    % of all shifts
VERB → NOUN    37                  27.41%
ADJ → NOUN     23                  17.04%
NOUN → VERB    16                  11.85%
VERB → ADJ     14                  10.37%
ADV → PP       14                  10.37%
NOUN → ADJ     10                  7.41%
VERB → ADV     9                   6.67%
ADV → ADJ      6                   4.44%
NOUN → ADV     4                   2.96%
ADJ → ADV      2                   1.48%
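The share of each shift type follows from its absolute count divided by the sum of all counts. A minimal sketch of that arithmetic, using the absolute numbers from the table (the variable names are illustrative):

```python
# Absolute numbers of shifts per type, as listed in the table.
shift_counts = {
    "VERB → NOUN": 37,
    "ADJ → NOUN": 23,
    "NOUN → VERB": 16,
    "VERB → ADJ": 14,
    "ADV → PP": 14,
    "NOUN → ADJ": 10,
    "VERB → ADV": 9,
    "ADV → ADJ": 6,
    "NOUN → ADV": 4,
    "ADJ → ADV": 2,
}

total = sum(shift_counts.values())

# Percentage of all shifts, rounded to two decimals.
shares = {shift: round(100 * n / total, 2) for shift, n in shift_counts.items()}

for shift, n in shift_counts.items():
    print(f"{shift:<12} {n:>3}  {shares[shift]:>6.2f}%")
print(f"Total shifts: {total}")
```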
Types of word class shifts II
            English ST verb    English ST noun
No shift    350                776
Shift       60                 30
The translation direction English-German is characterized through shifts from verbs to other word classes, in particular to nouns
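One way to make this comparison concrete is to express the table as shift rates per ST word class and as an odds ratio. This is only a descriptive sketch based on the four counts above; it is not a significance test, and the slides do not report one for this table.

```python
# Counts from the 2x2 table above.
counts = {
    "verb": {"no_shift": 350, "shift": 60},
    "noun": {"no_shift": 776, "shift": 30},
}


def shift_rate(cell: dict) -> float:
    """Proportion of aligned ST words that change word class."""
    return cell["shift"] / (cell["shift"] + cell["no_shift"])


def odds(cell: dict) -> float:
    """Odds of a shift versus no shift."""
    return cell["shift"] / cell["no_shift"]


verb_rate = shift_rate(counts["verb"])
noun_rate = shift_rate(counts["noun"])
odds_ratio = odds(counts["verb"]) / odds(counts["noun"])

print(f"ST verbs shifted: {verb_rate:.1%}")
print(f"ST nouns shifted: {noun_rate:.1%}")
print(f"Odds ratio (verb vs. noun): {odds_ratio:.2f}")
```

An odds ratio above 1 means that ST verbs are more likely than ST nouns to change word class in the final translation.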
Cognitive effort I
• Does the translation of nouns require more cognitive effort than the translation of verbs?
• Cognitive effort is measured using log-transformed values for total fixation duration normalized per character
• Means: noun -1.9, verb -1.78
• t = -0.7, df = 94.85, p-value = 0.48
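The measure and the comparison of means can be sketched as follows. Several points here are assumptions rather than details given on the slides: the natural logarithm, durations in seconds, and a Welch two-sample t-test (suggested by the non-integer degrees of freedom). The fixation values are invented for illustration and are not the study's data.

```python
import math
from statistics import mean, variance


def log_tfd_per_char(tfd_seconds: float, n_chars: int) -> float:
    """Log-transformed total fixation duration normalized per character."""
    return math.log(tfd_seconds / n_chars)


def welch_t(a: list, b: list) -> tuple:
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical (total fixation duration in seconds, word length) pairs.
noun_items = [(0.92, 6), (1.40, 9), (0.55, 4), (1.10, 8)]
verb_items = [(0.80, 5), (1.75, 8), (0.60, 4), (1.30, 7)]

noun_vals = [log_tfd_per_char(d, n) for d, n in noun_items]
verb_vals = [log_tfd_per_char(d, n) for d, n in verb_items]

t, df = welch_t(noun_vals, verb_vals)
print(f"mean noun = {mean(noun_vals):.2f}, mean verb = {mean(verb_vals):.2f}")
print(f"t = {t:.2f}, df = {df:.2f}")
```

Normalizing per character before taking the logarithm keeps longer words from appearing more effortful simply because they take longer to read.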
Cognitive effort II
• Means: noun → verb (n-v) -2.05, verb → noun (v-n) -1.65
• t = -1.57, df = 33.1, p-value = 0.13
• The slightly lower mean total fixation duration for shifts from nouns to verbs could potentially be explained by a reduction of grammatical complexity
EO: Instead of collapsing to a final fixed size, the height of the crushed ball continued to decrease, even three weeks [after the [application]Noun of weight]NP.
GTrans: Statt zu einer endgültigen festen Größe zusammenzufallen, nahm die Höhe des zusammengeknüllten Papierballs weiter ab, und zwar auch noch drei Wochen, [nachdem das Gewicht [angewendet]Verb wurde]Clause. (KLTC PROBRAL GT5)
Intermediate versions I
Verb to Noun shifts:
• Verb → Verb → Noun (3x)
• Verb → Noun → Noun (1x)
EO: Crumpling a sheet of paper seems simple and doesn't require much effort, but explaining [why the crumpled ball [behaves]Verb the way it does]Clause is another matter entirely.
GTrans_i: Ein Blatt Papier zusammen zu knüllen, erscheint einfach und erfordert wenig Anstrengung, jedoch zu erklären, [warum der zeras Papierknäuel sich so [verhält]Verb, wie es das tut]Clause, ist eine völlig andere Sache.
GTrans: Ein Blatt Papier zusammen zu knüllen, erscheint einfach und erfordert wenig Anstrengung; [die [Verhaltensweise]Noun des Papierknäuels]NP zu erklären, ist dagegen eine völlig andere Sache. (KLTC PROBRAL GT7)
Intermediate versions II
Noun to Verb shifts:
• Noun → Noun → Verb (1x)
• Noun → Verb (Verb) → Verb (2x)
EO: [After the [crumpling]Noun of a sheet of thin aluminized Mylar]PP, the researchers placed it inside a cylinder.
GTrans_i: [Nachdem sie ein dünnes Blatt aluminiumbeschichtetes Mylar [verkrumpelt]Verb hatten]Clause, gaben sie es in einen Zylinder.
GTrans_i: [Nachdem sie ein dünnes Blatt aluminiumbeschichtetes Mylar [verknäuelt]Verb hatten]Clause, gaben sie es in einen Zylinder.
GTrans: [Nachdem sie ein dünnes Blatt aluminiumbeschichtetes Mylar [verknittert]Verb hatten]Clause, gaben sie es in einen Zylinder. (KLTC PROBRAL GT3)
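The patterns on the two "Intermediate versions" slides read as the ST word class followed by the word classes of the successive TT versions. A minimal sketch of how such trajectories could be built and counted is given below; the function names are illustrative and the items are hypothetical, not the study's data.

```python
from collections import Counter
from typing import List, Optional


def pos_trajectory(st_pos: str, version_pos: List[str]) -> str:
    """Join the ST word class and the word classes of successive TT versions."""
    return " → ".join([st_pos, *version_pos])


def first_shift_index(st_pos: str, version_pos: List[str]) -> Optional[int]:
    """Index of the first TT version whose word class differs from the ST."""
    for i, pos in enumerate(version_pos):
        if pos != st_pos:
            return i
    return None  # no shift in any version


# Hypothetical items: (ST word class, word classes of TT versions in order,
# with the final version last).
items = [
    ("Verb", ["Verb", "Noun"]),
    ("Verb", ["Noun", "Noun"]),
    ("Noun", ["Noun", "Verb"]),
    ("Noun", ["Verb", "Verb", "Verb"]),
]

patterns = Counter(pos_trajectory(st, versions) for st, versions in items)

for pattern, n in patterns.items():
    print(f"{pattern} ({n}x)")
```

An index of 0 from `first_shift_index` would mean that the shift is already present in the first typed version, whereas a higher index means that the translator first kept the ST word class.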
Conclusion & Outlook
• An application of a Keystroke Logged Translation Corpus to triangulate product and process data
• Shifts from verbs in the ST to nouns in the final TT
  • the most pervasive type of shifts in the translation direction English-German
  • Verbs shifted to nouns are fixated slightly longer than nouns shifted to verbs
• Both types of shifts can involve intermediate stages (either ST POS or TT POS, i.e. a synonym of the item in the final TT)
• Determining cognitive effort based not only on the shift in the final but also intermediate translation versions → more data points required
• Taking into account further indicators of cognitive effort (further combining eye-tracking and keystroke logging data streams)
e-cosmos platform
• Creating a web-based platform for different scenarios of multimodal data integration and analysis
• Translation data
  • Integration: text, keystroke logging and eye-tracking data
  • Identification of word tokens in the intermediate translation versions and their linguistic annotation
  • Query tool for quantitative analyses of product and process data (cf. Carl & Jakobsen 2009)
[Figure: e-cosmos platform; labels: Linguistics, Computer Science, Information Management]
Thank you for your attention!
Tatiana Serbina
serbina@anglistik.rwth-aachen.de
RWTH Aachen University
Templergraben 55
52056 Aachen
www.rwth-aachen.de
References
Alves, Fabio, and Célia Magalhães. 2004. “Using Small Corpora to Tap and Map the Process-product Interface in Translation.” TradTerm 10: 179–211.
Alves, Fabio, and Daniel Couto Vale. 2009. “Probing the unit of translation in time: Aspects of the design and development of a web application for storing, annotating, and querying translation process data.” Across Languages and Cultures 10 (2): 251–73.
Alves, Fabio, and Daniel Couto Vale. 2011. “On drafting and revision in translation: A corpus linguistics oriented analysis of translation process data.” Translation: Computation, Corpora, Cognition 1: 105–22.
Carl, Michael, and Arnt Lykke Jakobsen. 2009. Objectives for a query language for user-activity data. In 6th International Natural Language Processing and Cognitive Science Workshop, Milano, Italy.
Čulo, Oliver, Silvia Hansen-Schirra, Stella Neumann, and Mihaela Vela. 2008. “Empirical studies on language contrast using the English-German comparable and parallel CroCo corpus.” In Proceedings of the LREC 2008 Workshop “Building and Using Comparable Corpora”, 47–51. Marrakesh, Morocco. http://www.dfki.de/lt/publication_show.php?id=3991.
Cyrus, Lea. 2009. “Old concepts, new ideas: Approaches to translation shifts.” MonTI. Monografías de Traducción e Interpretación 1: 87–106.
Halliday, Michael A. K., and Christian M. I. M. Matthiessen. 2013. An introduction to functional grammar. London: Arnold.
Halverson, Sandra L. 2007. “A cognitive linguistic approach to translation shifts.” In The study of language and translation, edited by Willy Vandeweghe, Sonia Vandepitte, and Marc van de Velde, 105–21. Amsterdam: Benjamins.
Hansen-Schirra, Silvia, Stella Neumann, and Erich Steiner. 2012. Cross linguistic corpora for the study of translations: Insights from the language pair English-German. Berlin: de Gruyter.
Hansen-Schirra, Silvia, Stella Neumann, and Mihaela Vela. 2006. Multi-dimensional annotation and alignment in an English-German translation corpus. In Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing, EACL 2006, 35–42. Trento, Italy.
Neumann, Stella, Adriana Pagano, Fabio Alves, Piritta Pyykkönen, and Igor da Silva. 2010. Targeting (de-) metaphorization: Process-based insights. The 22nd European Systemic Functional Linguistics Conference and Workshop. 9th–11th July 2010. Koper, Slovenia.
Serbina, Tatiana, Paula Niemietz, and Stella Neumann. Forthcoming. Development of a keystroke logged translation corpus. In: Claudio Fantinuoli and Federico Zanettin (eds.): Parallel corpora for translation studies. Language Science Press.
Steiner, Erich. 2001. Translations English-German. Investigating the relative importance of systemic contrasts and the text-type "translation". In: SPRIKreports Reports of the Project Languages in Contrast 7, pp. 1–49.
Vinay, Jean-Paul, and Jean Darbelnet. 1995 (1958). Comparative stylistics of French and English: A methodology for translation. Amsterdam: Benjamins. Translated and edited by Juan C. Sager and M.-J. Hamel.