Presentation by Dr. François Massion

The Practice of Terminology Work
Dr. François Massion, D. O. G. Dokumentation ohne Grenzen GmbH
(francois.massion@dog-gmbh.de)
May 2014
Tekom Belgium - © D. O. G. GmbH
1
Overview
The most important aspects of terminology work
Use of terminology
Terminology extraction
Term entry: Models and concepts
Management and distribution
Programs and standards
Mr. Smith

This is Mr. Smith, design engineer
I am the "thing"

Mr. Smith has designed a "thing"
Journey of a concept
This is my new thing. I call it A-12 Timer.
I get a quote for the production of the electronic timer.
I need 300 timing circuits.
We will start a sales campaign for the timer A-12.
Now the time emitter A-12 is being produced.
Roles shown: Developer, Purchasing, Customer, Sales, Supplier
Case study

Let's show something ...
Test

What do you call this thing "N"?
"soap dispenser"
"detergent dispenser"
"detergent drawer"
"detergent tray"
"soap drawer"
"洗涤剂抽屉"
"清洁剂抽屉"
"洗洁精抽屉"
50 Years ago
Hand made documentation
What has changed since?
Modularization and re-use
On the authoring side
Modules
Written by Mr. Smith, June 2007
Written by Mr. Jones, April 2009
On the translation side
Sentences (or "segments")
Machine translated, April 2007
(Change Hidden Key : Input the new hidden key .)
Translated by Mr. Martinez, May 2010
Translated by Mr. Caballero, Oct. 2008
Translated by Ms. Fernandez, July 2010
Growing importance of terminology
Globalization of the world economy means:
• More trade and competition
• More products and shorter product development cycles (example: iPhone)
• More laws and regulations
This leads to:
• More documents in shorter time intervals
• More translators and authors
• More frequent updates
• More languages
New translation processes
<XML>
Content Management System
Databases
Machine Translation (MT)
Translation Memory System
Documents
"Single Source Publishing"
Terminology and the outside world


• Within the company: Standardize
• With the outside world: Understand and be understood
  – Customers (sales, support, user-friendly documentation)
  – Government & institutions
  – Suppliers
  – Public at large
• Know and use the language of the outside world
Terminological influence (in %)
Bar chart comparing terminological influence over own publications, customers, suppliers, government agencies and the public; values shown: 100%, 75%, 50%, 30%, 10%.
Terminology and knowledge
Terminology and knowledge
William Whewell (1794-1866)
Whewell was a co-founder and president of the
British Association for the Advancement of Science.
He was a Master of Trinity College, Cambridge. He
wrote two important studies on the history and
philosophy of science. He coined the word
"scientist" in 1833.
The Philosophy of the Inductive Sciences, Vol. 1, 1840
Terminology and knowledge
• Knowledge is a dynamic development process
• Terminology helps to order and transfer knowledge
• Break down knowledge into its components:
  – Creation of concepts
  – Concepts may be different between languages!
  – Relationships between concepts (e.g. "belongs to")
  – Add clarity with definitions
  – Work out semantic characteristics (with respect to terminology gap)
Use of terminology
Terminology extraction
Term entry: Models and concepts
Management and distribution
Programs and standards
Types of extraction
Terminology extraction
• Manual: monolingual, multilingual
• Automatic: monolingual, multilingual
• Interactive: monolingual, multilingual
Exercise 1: Extraction
(downloaded from the Internet) on 28.04.2014
Small exercise

Which words do you want to extract?
Extraction in the source language


• Which words do you want to extract?
• Which word categories are considered?
  – Grammatically:
    • Nouns, verbs and adjectives
    • Bi-grams, tri-grams and collocations
  – Content:
    • Specific technical terms of a company
    • General technical terms
    • Abbreviations
    • General terms
How many terms shall be extracted?

• Examples:
  – Manufacturing company: 3,000 to 10,000 terms per language
  – Steel plant: 58,000 entries!
  – Open Biomedical Ontologies: 1.5 million concepts!
• Optimal scope
  – Depends on goals, time, and resources
  – Don't forget the maintenance
  – If time is short: only the very specific terms for a domain / company
Terminology and goals
• Goals have an impact on terminology work
• Examples of goals:
  1. Reduce costs of translation through standardization
  2. Facilitate trade between countries
  3. Reach more customers
  4. Organize knowledge
  5. Improve communication
Diagram labels: GOALS; Tools; Data model; Type of content; # of entries
Our vision

Terminology "at the press of a button"
Terminology
entries
Start
But ...


• Text = highly complex system of lexical entities
• Automatic recognition of:
  – Various appearances of the same term ("lemmatize")
  – Collocations: "user interface" vs. "some interface"
  – Lexical variants: "program", "application", "software", "tool" ...
  – Correct relations (ambiguity): "Check settings" = "Settings of the check" or "Check the settings"?
  – Correct grammatical category: "Copy" = verb or noun?
Learning from computational linguistics
Methods
Nothing comes out of it!
Computational linguistics
• Objective: Understand natural languages in order to process them with programs
• Natural Language Processing (NLP)
• Create a language model, e.g. recognize components and their relations
  – Extraction of patterns such as Adj + Noun, Noun + Noun.
• Application examples:
  – Machine translation (MT)
  – Terminology extraction
  – Information mining
Word analysis: Morphology


• Analytical languages (such as Chinese) and synthetic languages with affixes and inflections (case, number, gender, etc.).
  – Example: Drukontluchtingsklep / 减压阀 / pressure relief valve / vanne de dépressurisation
• Stemming: Reduce to the root form of a word (= stem):
  – activated, activation, actively, actives → activ (stem)
• Lemmatization: Reduction to the base form of a word, lexical entry (= lemma)
  – activated → activate (verb)
• N-grams: bi-gram, tri-gram = sequences of N consecutive words. Often adjacent, but not always:
  – EN "[visual] clogging indicator"
  – FR "indicateur [visuel] de colmatage"
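The difference between stemming and lemmatization can be sketched in a few lines of Python. This is a minimal illustration only: the suffix list and the lemma dictionary are assumptions chosen to reproduce the "activ" / "activate" example, not a real stemmer or lexicon.

```python
# Illustrative suffix list; a real stemmer uses far more rules.
SUFFIXES = ("ation", "ated", "ely", "es")

# Illustrative lemma dictionary; real lemmatizers rely on full lexicons.
LEMMAS = {"activated": "activate"}


def stem(word, min_stem=3):
    """Strip the longest matching suffix to approximate the stem."""
    w = word.lower()
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if w.endswith(suffix) and len(w) - len(suffix) >= min_stem:
            return w[: -len(suffix)]
    return w


def lemmatize(word):
    """Look up the base form; fall back to the lowercased word."""
    w = word.lower()
    return LEMMAS.get(w, w)


for form in ["Activated", "activation", "actively", "actives"]:
    print(form, "->", stem(form))
print("activated ->", lemmatize("activated"))
```

Stemming needs only rules and may produce non-words such as "activ"; lemmatization needs a lexicon but returns a valid dictionary entry.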
Extraction steps
Corpus
Import
• Remove format and punctuation
• Create list of words (single words, n-grams)
Clean-up
• Remove duplicates, unwanted words, stop words
• Lemmatize
Enter data
• Add information according to data model (status, etc. )
• Add equivalents in other languages
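The import and clean-up steps above can be sketched as a small Python script. The stop-word list is a hypothetical sample for Dutch and would need to be replaced by a real list; lemmatization and data entry are left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny Dutch stop-word list (assumption).
STOP_WORDS = {"de", "het", "een", "en", "van", "voor", "op"}


def extract_candidates(text, stop_words=STOP_WORDS):
    # Import: drop punctuation and digits, lowercase, split into single words.
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Clean-up: remove stop words; counting collapses duplicates.
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common()


sample = "Plaats de pomp als de vloer aan deze voorwaarden voldoet."
candidates = extract_candidates(sample)
for word, frequency in candidates:
    print(frequency, word)
```

The frequency column is a by-product that can help prioritize candidates during manual review.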
Which types of words? Raw corpus
Belangrijk!
Zorg voor voldoende plaats voor onderhoud en instandhouding!
Voldoende ruimte inplannen voor het openen van de aansluitkast en de elektrische
aansluiting en de eventuele frequentieomzetter inplannen.
Het betonfundament moet
geheel zijn uitgehard,
voldoende stijfheid hebben (minimaal klasse X0 conform DIN EN 206),
een horizontaal en effen oppervlak hebben,
vibraties, krachtinvloeden en stoten kunnen opnemen
zo zijn bemeten, dat de stervormige grepen (004) op de filterdeksel (003) handmatig
moeten kunnen worden geopend.
Plaats de pomp als de vloer aan deze voorwaarden voldoet.
Which types of words? First categorization
Belangrijk!
Zorg voor voldoende plaats voor onderhoud en instandhouding!
Voldoende ruimte inplannen voor het openen van de aansluitkast en de elektrische
aansluiting en de eventuele frequentieomzetter inplannen.
Het betonfundament moet
geheel zijn uitgehard,
voldoende stijfheid hebben (minimaal klasse X0 conform DIN EN 206),
een horizontaal en effen oppervlak hebben,
vibraties, krachtinvloeden en stoten kunnen opnemen
zo zijn bemeten, dat de stervormige grepen (004) op de filterdeksel (003) handmatig
moeten kunnen worden geopend.
Plaats de pomp als de vloer aan deze voorwaarden voldoet.





• New terminology
• Known terminology
• Stop words
• Words to be qualified
• Abbreviations / acronyms / product names
The Word/Excel method
• Quick help in urgent situations
• Word:
  1. Save file as a text file (= remove formatting)
  2. Remove superfluous characters
  3. Generate a list of words (replace blanks with return)
• Excel:
  1. Delete duplicates
  2. Remove stop words
  3. Reduce to basic form ("lemmatize")
Remove superfluous words (Excel)

• Stop words (MATCH function)
  – General words (is, the, always ...)
  – Existing terms in database
• Save rejected words for later clean-up actions → "recycling"
• Stop words become more efficient over time
• Special stop words for groups of words (phrases)
• Include existing terminology in stop words list
Sorting out in Excel
Step 1: Insert an empty column
Step 2: Enter a symbol for terms you want to reject (or accept)
Step 3: Sort the list by this column
Step 4: Edit terms
Step 5: Save the list of rejected entries for later re-use
Lemmatization
• New column with formula RIGHT(A1;2)
• Sort on endings
• Edit inflectional endings
Used Excel functions

• Comparison of entries (here "A1") with lists
  =MATCH(A1;MYSTOPWORDS;0) or
  =MATCH(A1;$C$1:$C$1000;0)
  [on German OS: "VERGLEICH"]
  "MYSTOPWORDS" is a variable, self-defined name
• Length of an entry (e.g. cell "A1")
  =LEN(A1)
• Two last characters from the right (of "A1")
  =RIGHT(A1;2)
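For readers who prefer a script to a spreadsheet, the three Excel functions have close Python counterparts. The sketch below is an approximation under stated assumptions: `match` and `right` are hypothetical helper names, the word lists are invented samples, and `None` stands in for Excel's #N/A.

```python
# Plays the role of the named range MYSTOPWORDS (sample values).
MY_STOP_WORDS = ["is", "the", "always"]


def match(value, lookup):
    """Like =MATCH(value;lookup;0): 1-based position, or None if absent."""
    lowered = [item.lower() for item in lookup]
    try:
        return lowered.index(value.lower()) + 1
    except ValueError:
        return None


def right(text, n):
    """Like =RIGHT(text;n): the last n characters."""
    return text[-n:] if n > 0 else ""


words = ["the", "valves", "pump", "opened", "is"]
kept = [w for w in words if match(w, MY_STOP_WORDS) is None]
rejected = [w for w in words if match(w, MY_STOP_WORDS) is not None]

# Sorting on the last two characters groups similar endings for lemmatizing.
by_ending = sorted(kept, key=lambda w: (right(w, 2), w))
print(by_ending, [len(w) for w in by_ending])
```

Keeping `rejected` in a file mirrors the idea of saving rejected entries for later re-use.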
Bi-grams, tri-grams, n-grams (1/4)

• What are n-grams?
  – Set groups of 1, 2, n words
  – Example: "elektrische aansluiting"
  – More frequent in some languages (English, Italian etc.)
  – Challenge: How do you recognize them?
Bi-grams, tri-grams, n-grams (2/4)

• Copy the unsorted list of words into a second column with an offset of one cell up, so that each row holds a word and its successor
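The "offset of one cell" trick corresponds to pairing a word list with a copy of itself shifted by one position. A minimal Python sketch, using an invented sample sentence:

```python
from collections import Counter


def bigrams(words):
    # zip(words, words[1:]) pairs each word with its successor,
    # just like two spreadsheet columns offset by one cell.
    return [f"{first} {second}" for first, second in zip(words, words[1:])]


words = "de elektrische aansluiting en de elektrische aansluiting".split()
counts = Counter(bigrams(words))
for pair, frequency in counts.most_common():
    print(frequency, pair)
```

Pairs that occur more than once are the first candidates to review; pairs containing stop words can usually be discarded.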
Bi-grams, tri-grams, n-grams (3/4)
• Sort in a text editor (Word, Notepad etc.) and replace tabs with spaces
• Copy into a column in Excel and evaluate (partial results)
Bi-grams, tri-grams, n-grams (4/4)

Excel result:

Review list manually
Bilingual extraction: Issues

• No one-to-one symmetry:
  1) Different morphology:
     – Dutch: frequentieomzetter (1 word); English: frequency converter (2 words); Chinese: 目录 (2 characters)
     – Chinese: no plural in Chinese characters: treaty / treaties = 条约
  2) Non-matching word categories
  3) Different syntactical patterns
  4) Semantic differences (terminology gap):
     – Different nomenclatures (e.g. International Classification of Diseases (ICD), but differing categories in many countries)
Methodology
Approaches to extract bilingual terminology:
Option #1
• Align texts (e.g. translate with CAT)
• Extract key terms separately in both languages and highlight them
• Bilingual term alignment
Option #2
• Align texts
• Sort according to length
• Shorter entries (up to 30 characters) may be matching terms
Option #3
• Use extraction software like MultiTerm Extract
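Option #2 can be sketched as a filter over aligned segment pairs. The sample pairs and their English translations are invented for illustration, and applying the 30-character limit to both sides is an assumption.

```python
def short_pairs(aligned, max_len=30):
    """Keep aligned pairs where both sides are short, sorted by source length."""
    short = [
        (source, target)
        for source, target in aligned
        if len(source) <= max_len and len(target) <= max_len
    ]
    return sorted(short, key=lambda pair: len(pair[0]))


# Invented Dutch-English sample alignment.
aligned = [
    ("frequentieomzetter", "frequency converter"),
    ("Plaats de pomp als de vloer aan deze voorwaarden voldoet.",
     "Install the pump if the floor meets these requirements."),
    ("aansluitkast", "terminal box"),
]

for source, target in short_pairs(aligned):
    print(source, "=", target)
```

Short segments such as headings, table cells and list items often consist of a single term, which is why they are worth reviewing first.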
Equivalence and standards

• Standard ISO 12620 describes equivalence categories:
  – narrower
  – equivalent
  – quasi-equivalent (or near-equivalent)
  – broader
  – equivalent phrase: qualifier assigned to a phraseological unit in one language that expresses the same semantic content as a phraseological unit in another language.
Automatic translation of "time" into various languages
Tips & tricks (1)

• Where do you find terminology?
  – Databases (IATE = Inter-Active Terminology for Europe, LEO)
  – Online dictionaries (e.g.: "inurl:terminology")
  – Collections of dictionaries
  – Discussion groups (www.proz.com)
  – Search engines
• Helpful search methods
  – Searched phrase in quotation marks
  – Search in multiple languages: "socket" + "ist ein"
  – Search in multilingual sites: "site:ch"
  – term + "de-en" (example: Schraube de-en)
Tips & tricks (2)

• Use the Internet
  (a) To validate an expression
  (b) To find the frequency of alternatives
• Example: "Schakelschema" in French?
  – Schéma de câblage
  – Schéma des connexions
  – Schéma des circuits
  – Schéma électrique
Tips & tricks (3)

Use the image search feature
Frequency converter = 变频器 ???
Use of terminology
Terminology extraction
Term entry: Models and concepts
Management and distribution
Programs and standards
Organizing the terminology
• Classical dictionaries are sorted in alphabetical order.
  – No way to group words with the same meaning:
    • Synonyms such as "application" and "software"
    • Abbreviations and full names: "afz." and "afzender"
• Classical dictionaries list all meanings of one word.
  – "bank": what are the meanings of this word?
  – Same with multilingual dictionaries
The "semiotic triangle"
• Mental representation = "concept": "Electric socket"
• Term (= designation): "stopcontact", socket, jack, wall outlet, junction box
• (Actual) object
What is the difference?
Concept / Term

• Term-based terminologies
  – Traditional dictionaries
  – Encyclopedias
• Concept-based terminologies
  – The basic entry is a concept
  – Can include multiple terms (e.g. synonyms) and languages
But what is really a concept?

• Reality can be looked at in different ways
  – Why is this the same?
  – Why is this different?
• Different visions and perceptions:
  – "drawer" vs. "schuiflade"
  – "exit" → "uitgang" / "afrit"
• Influence of culture, educational system, language community, classification systems ...
• Translation issues
Concepts, terms and synonyms


• Componential Analysis (CA): relevant characteristics
• Useful to define the degree of equivalence between words in a language or between languages
Synonyms - Homonyms

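Meaning relationships (concept / term):
• Synonymy: one concept (1), several terms (Ta, Tb, Tc). Example: software, program, application
• Polysemy: several concepts (1, 2, 3), one term (T). Example: foot
• Homonymy: several concepts (1, 2, 3), identical terms (T, T, T). Example: mouse (pl. "mice"), mouse (pl. "mouses", "mice")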
Exercise 2: Data model
• Which information should be added to the terms?
• What are the hierarchical levels in the data model?
• Which levels should the additional information be allocated to?
Data categories


• ISO 12620:2009-12 defines more than 200 data categories
  "Terminology and other language and content resources - Specification of data categories and management of a Data Category Registry for language resources"
  – Concept oriented categories
  – Administrative categories
  – Term related categories
• www.isocat.org
A data model (example)
Concept level
• Definition
• Source of definition
• Image
• Attribute
• Comment
Language level
• Homonym available (Y/N)
Term level
• Status
• Word type
• Usage
• Example
• Term comment
Picklist values shown on the slide:
• Approved, To be checked, Authorized, Prohibited, Unclear, Not required, To be deleted
• Noun, Verb, Adjective, Adverb, Proper Name, Abbreviation, Short Form, Display Text
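A three-level model of this kind can be sketched with Python dataclasses. The class names, field names and default values below are assumptions made for illustration; they are not the schema of any particular terminology tool.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    # Term level: the designation plus its attributes.
    text: str
    status: str = "To be checked"
    word_type: str = "Noun"
    usage: str = ""
    example: str = ""
    comment: str = ""


@dataclass
class LanguageSection:
    # Language level: groups all terms of one language.
    language: str
    homonym_available: bool = False
    terms: list = field(default_factory=list)


@dataclass
class ConceptEntry:
    # Concept level: information shared by all languages and terms.
    concept_id: int
    definition: str = ""
    definition_source: str = ""
    image: str = ""
    comment: str = ""
    languages: dict = field(default_factory=dict)

    def add_term(self, language, text, **attributes):
        section = self.languages.setdefault(language, LanguageSection(language))
        section.terms.append(Term(text, **attributes))
        return self


entry = ConceptEntry(1, definition="Programmable electronic device")
entry.add_term("EN", "PC", word_type="Abbreviation")
entry.add_term("EN", "Computer", status="Approved")
entry.add_term("DE", "Rechner")
print(entry.languages["EN"].terms)
```

Because synonyms live inside one concept entry, adding a language or a synonym never duplicates the definition.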
Semantic enrichment
• Adds important information to the meaning of the concept
• Particularly important for complex or ambiguous terms: "device", "disk" ...
• Instruments:
  – Definition
  – Reference
  – Image
  – Contextual data (example)
  – Remarks
What is a good definition?
• Definitions must:
  – Have a coherent structure
  – Give unique essential characteristics (function, appearance, consistency)
  – Be as short as possible
• Definitions must not:
  – Consist of synonyms ("a dog is a hound")
  – Be circular (A is B and B is A)
  – Be negative (XYZ is no ABC)
  – Be too narrow / broad
Define a touch screen






Touch screen
Bij een aanraakscherm of touch screen kan je met de vinger of een speciale pen (stylus genoemd)
een handeling op de monitor uitvoeren. Er komt dus geen toetsenbord of muis bij kijken.
touch screen
aanraakgevoelig scherm
Type beeldscherm dat reageert bij het aanwijzen met je vinger. Veel gebruikt bij infokiosken. Maakt
een toetsenbord quasi overbodig.
Aanraakscherm
beeldscherm waarmee je je computer bedient door bepaalde plaatsen aan te raken computers
Aanraakscherm
een beeldscherm dat ook als invoerapparaat voor een computer kan worden gebruikt door het
scherm aan te raken.
Aanraakscherm
Bij een aanraakscherm of touchscreen wijs je met de vinger of een speciale pen op de monitor om
een handeling uit te voeren. Hier komt dus geen muis of toetsenbord aan te pas. Het grootste
voordeel hiervan is dat de gebruiker in principe een oneindig aantal (virtuele) knoppen kan
aansturen met hetzelfde oppervlakte.
Aanraakscherm
Een aanraakscherm (vaak met het Engelse touchscreen aangeduid) is een beeldscherm dat ook als
invoerapparaat voor een computer of embedded system (zoals een smartphone of tablet) kan
worden gebruikt door het scherm aan te raken.
Usage of terminology
Terminology extraction
Term entry: Models and concepts
Management and distribution
Programs and standards
Distribution of terminology

• Usage: What / whom for?
  – Work: Translator, interpreter, author, programmer ...
  – Quality check
  – Information and knowledge
• Applications
  – Translation memories
  – Authoring tools, quality assurance tools
  – Databases etc.
• Central repository
  – Settle responsibilities and rights
  – Stores all modifications from various users
Exchange formats

• Excel:
  – List with information in columns
  – List with information in cells
• PDF, TXT, CSV, MS-Word
• XML based standards:
  – TBX (TermBase eXchange)
  – OLIF (Open Lexicon Interchange Format)
  – MARTIF (Machine-Readable Terminology Interchange Format)
  – UTX (Universal Terminology eXchange) [designed by AAMT (Asia-Pacific Association for Machine Translation)]
List with term data in columns


• Problem with variable number of columns

Def. | Image | EN1 | Attr. | EN2 | Attr. | DE | Attr. | FR ...
XXX |  | PC |  | Computer |  | Rechner |  |

• Many languages, many attributes: very large tables
List term data in rows


• IDs must be allocated
• Difficult exchange of data:
  – Database is three-dimensional
  – List is two-dimensional

ID | Def. | Image | Language | Term | Attr. 1 | Attr. 2 | .. | N
1 |  |  | DE | Rechner
1 |  |  | EN | PC
1 |  |  | EN | Computer
2 |  |  | DE | Maus
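Flattening concept-oriented entries into a two-dimensional row list can be sketched as follows. The dictionary layout of `concepts` and the reduced column set are assumptions for illustration; the concept ID is what keeps synonyms and translations together after export.

```python
import csv
import io

# Assumed in-memory layout: one dict per concept, terms as (language, term).
concepts = [
    {"definition": "", "terms": [("DE", "Rechner"), ("EN", "PC"), ("EN", "Computer")]},
    {"definition": "", "terms": [("DE", "Maus")]},
]

FIELDS = ["ID", "Def.", "Language", "Term"]


def to_rows(concepts):
    """One row per term; the shared ID links rows of the same concept."""
    rows = []
    for concept_id, concept in enumerate(concepts, start=1):
        for language, term in concept["terms"]:
            rows.append({
                "ID": concept_id,
                "Def.": concept["definition"],
                "Language": language,
                "Term": term,
            })
    return rows


def to_csv(rows):
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = to_rows(concepts)
print(to_csv(rows))
```

Concept-level data such as the definition is repeated on every row, which is the price of squeezing a three-level structure into a flat list.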
Maintenance of terminology



• Terminology is never complete:
  – Status
  – Optional fields
  – Control the use of synonyms
  – Detection of polysemous terms
  – Use experience gained from quality assurance (texts, translations, terminology).
• Depending on the structure of the dictionary, efforts vary (definitions or not, additional information, etc.)
• Approx. 10% of the time needed to create new terminology entries
Challenge: Manage changes
LookUp – a short introduction


• Access via the internet or an intranet
• Definition of the different user rights
  – Read
  – Comment
  – Write
  – Also for individual languages (notification by mail)
• Free definition of data model
• Definable queries
• Suggestion system
• Recognition of homonyms
• Various export and import formats (XLS, TBX)
New dictionary
Access Data
http://88.79.119.98
• User: Tekom_May14
• Password: Antwerp_011
Exercise 3: Terminology quality
• Which entries are not correct?
• Entry rules
  – Singular without article
  – Uniform spelling
  – One type/piece of information per data field
Is terminology always recognized?
German source text:
Mit elektrischem Handrad, Tasten für Achs-Richtung, Vorschub, Eilgang, Not-Aus- und Zustimmtaste, sowie drei belegbaren Funktionstasten.
English translation (wording in brackets is the uncorrected term, followed by its correction):
With electrical handwheel, keys for [axial direction] axis direction, feed, rapid traverse, [Emergency button] Emergency Stop button and acceptance key as well as three assignable [functional keys] function keys.
3 unrecognized errors!
Legend:
• In database and recognized
• In database and not recognized
• Not entered in database
Entries in the terminology database:
• Not-Aus-Taste
• Achsrichtung (please check hyphen)
Can you always rely on terminology?
Checking a project: Terminology is ...
Pie chart with the categories correct, superfluous, missing and not usable; values shown: 60%, 15%, 15%, 10%.
Quality of terminology
Linguistic criteria
1. Grammar and language usage ("apple(s)")
2. Unambiguousness: Resistance = value or hardware? Control = action or hardware?
3. Consistent spelling (timeout / time out / time-out). (Consistency of terms, ISO standard 704:2000)
4. Comprehensibility ("Hippopotomonstrosesquippedaliophobia" = "the fear of long words" [Hip-po-pot-o-mon-stro-ses-quip-pe-da-li-o-phob-i-a])
5. Natural order (sealing, elastic)
6. Singular without article
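Inconsistent spellings such as timeout / time out / time-out can be found automatically by comparing terms after removing spaces and hyphens. The sketch below is one simple way to do this; the normalization rule is an assumption and will also group unrelated terms that happen to collapse to the same string.

```python
import re
from collections import defaultdict


def spelling_variants(terms):
    """Group terms that differ only in case, spaces or hyphens."""
    groups = defaultdict(set)
    for term in terms:
        key = re.sub(r"[\s\-]+", "", term.lower())
        groups[key].add(term)
    # Only groups with more than one spelling are of interest.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


terms = ["timeout", "time out", "time-out", "pump"]
print(spelling_variants(terms))
```

Each reported group still needs a human decision on which spelling becomes the approved term.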
Quality of terminology
Technical criteria
• Format, no parentheses, no question marks
• Data elementarity (information is separated into individual data categories)
• Term autonomy in concept based dictionaries
• Comprehensiveness and correctness of information (e.g. status, mandatory fields)
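Some of these format rules lend themselves to automatic checks. The following sketch tests two of them, plus the "without article" entry rule; the function name, the messages and the English article list are assumptions for illustration.

```python
import re

# Assumption: entries are English; other languages need their own list.
ARTICLES = {"a", "an", "the"}


def check_entry(term):
    """Return a list of rule violations for one term entry."""
    problems = []
    if re.search(r"[()?]", term):
        problems.append("contains parentheses or question marks")
    words = term.split()
    if words and words[0].lower() in ARTICLES:
        problems.append("starts with an article")
    return problems


for candidate in ["apple(s)", "the pump", "valve?", "pump"]:
    print(candidate, "->", check_entry(candidate) or "ok")
```

Checks like these only catch formal defects; whether a term is singular or correctly spelled still requires linguistic resources or manual review.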
Overview of the problems to be solved

• Detection of synonyms
  – Special case: wrongly separated concepts
  – Special case: contextual synonyms (access point/point)
• Detection of homonyms
• Obsolete entries
• Missing, wrong, contradictory attributes
• Definitions: Structure and content, homonyms without definition
• Organization: References, relationships
Quality of terminology
Autonomy of terms
Adapted from: Birgit Wöllbrink – Tools für Terminologiemanagement, in: Tekom – Schriften zur Technischen Kommunikation No. 12, 2008, p. 81
Translation-friendly terminology

• Takes into account the international use, "neutral with regard to culture"
• Always use the complete form:
  – respiratie-apparaat: respiration device
  – huishoudelijk apparaat: household appliance
  – espresso-apparaat: espresso machine
  – massage apparaat: massager
  – politie apparaat: police apparatus
  – iOS-apparaat: iOS device
  – The short form "apparaat" could stand for anything from "respiratie-apparaat" to "iOS-apparaat"
• Takes into account the issue of equivalence (e.g. narrower/broader meaning): definitions, fields for translation-related information
• Terms should be understandable for readers who do not see the actual object (machine, software, device, service).
Usage of terminology
Terminology extraction
Term entry: Models and concepts
Management and distribution
Programs and standards
Terminology programs
Translation window
Multiterm Extract
Synchroterm
Some software packages…
Software | Description | Web
acrolinx® Terminology Lifecycle Management | Terminology extraction and management, terminology platform | www.acrolinx.com
Across crossTerm | System for the extraction, management and review of terminology. | www.across.net
AntConc3.2 | Statistical analysis of text, free | www.antlab.sci.waseda.ac.jp
Concord | Wordsmith: concordance and keywords | www.lexically.net
Concordance | Monolingual terminology extraction with sentence examples. Good value. | www.concordancesoftware.co.uk
Extphr33 | Monolingual terminology extraction, free | http://publish.uwo.ca/~craven/freeware.htm
Heartsome Dictionary Editor | XML-based dictionary | www.heartsome.net
KWIC Concordance | Corpus statistics, concordance, free | www.chs.nihonu.ac.jp/eng_dpt/tukamoto/kwic_e.html
KWICKWIC | Concordance tool | www.kwickwic.com
Lookup | Internet-enabled terminology platform, purchase and rental software versions | www.dog-gmbh.de
MultiTerm 2011 and MultiTerm Extract | Offered by SDL / Trados. Integrated in Trados technology. Multilingual. | www.sdl.com
qTerm | Web-enabled system for managing terminology. | www.kilgray.com
SDLX Phrase Finder | Terminology extraction with linguistic skills. Available for seven languages. | www.sdl.com
Simple Concordance Program (SCP) | Concordance and extraction program (monolingual), freeware | www.textworld.com
Synchroterm | Bilingual terminology extraction, semi-automatic, efficient | www.terminotix.com
Terminology Extractor | Extraction of monolingual terminology | www.chamblon.com/terminologyextractor.htm
TermStar | Terminology module of Star Transit | www.star-ag.ch
Textanz | Monolingual terminology extraction with context specification | www.cro-code.com
Textstat | Statistical analysis of text, free | www.niederlandistik.fu-berlin.de/textstat/
ISO Standards

• ISO Technical Committee (ISO/TC 37). Since 1951:
  – ISO 704:2009 Terminology work – Principles and methods
  – ISO 1087-1:2000 Terminology work – Vocabulary – Part 1: Theory and application
  – ISO 12620:1999 Computer applications in terminology – Data categories
  – ISO 12620:2009 is now an open category database ("Data Category Registry"): www.isocat.org
  – ISO 16642:2003 Computer applications in terminology – Terminological markup framework
  – ISO 15188:2001 Project management guidelines for terminology standardization
  – ISO/DIS 22274 Systems to manage terminology, knowledge and content – Concept-related aspects for developing and internationalizing classification systems
Bibliography





• Best Practices in der Terminologiearbeit. 2.0 Akten des Symposions. Deutscher Terminologie-Tag e.V., Heidelberg, 2014. Price: € 50
• Cabré, M. Teresa. Terminology: Theory, Methods and Applications. John Benjamins Publishing; 1999
• COTSOES: Recommendations for Terminology Work. Bern; 2002 (can be downloaded: http://termcoord.files.wordpress.com/2013/07/here.pdf)
• Sager, Juan C. A Practical Course in Terminology Processing. John Benjamins Publishing; 1990
• Wright, S.E./Budin, G. (comp.). Handbook of Terminology Management. John Benjamins Publishing; Volume 1: 1997; Volume 2: 2001
Thank you for your attention!
Dr. F. Massion
D. O. G. Dokumentation ohne Grenzen GmbH
Neue Ramtelstr. 12
71229 Leonberg, Germany
francois.massion@dog-gmbh.de