Development and evaluation of a urine protein expert system

Clinical Chemistry 42:8, 1214-1222 (1996)
Miroslav Ivandić,* Walter Hofmann, and Walter G. Guder

Based on the quantitative determination of creatinine, total protein, albumin, α1-microglobulin, IgG, α2-macroglobulin, and N-acetyl-β,D-glucosaminidase in urine in combination with a test strip screening, the findings of hematuria, leukocyturia, and proteinuria can be assigned to prerenal, renal, or postrenal causes. Using this graded diagnostic strategy as a knowledge base, we developed a computer-based expert system for urine protein differentiation (“UPES”) as a decision-supporting tool. The knowledge base was implemented as a combination of “if/then” rules and two-step bivariate distance classification of marker proteins. The knowledge for this form of pattern recognition was derived from the results for a set of 267 patients with clinically and histologically documented nephropathies. To determine the diagnostic value of UPES, we tested another set of data: results for 129 urine analyses from 94 patients. Using these data, the system reached 98% concordance with the clinical diagnoses for the patients and was superior to the diagnostic interpretations of four human experts. UPES has been successfully integrated into the laboratory routine process, including automated data import.

INDEXING TERMS: knowledge-based system • decision-supporting system • albumin • α1-microglobulin • α2-macroglobulin • kidney diseases • hematuria • leukocyturia • proteinuria • nephropathy

Continuously changing medical knowledge has resulted in increasing specialization in medicine. Providing optimal medical care requires experts who can keep up with the enormous information flow; however, such experts are not always available. To conserve the knowledge of a specialist and to widely distribute this knowledge, software tools called expert systems or, better, knowledge-based systems have been developed and are being used with increasing frequency. Laboratory medicine, given its high degree of specialization and its use of objective and quantitative findings, seems especially suited to benefit from these computer programs [1, 2].

Here we describe such a decision-supporting system, the Urine Protein Expert System (UPES), developed for the interpretation of urine protein differentiation.

As with electrophoretic techniques [3-5], quantitative analysis of urine marker proteins has been successfully applied to detect and differentiate nephropathies [5-7]. The multivariate evaluation of the excretion pattern allows differentiation of prerenal from glomerular, tubular, and postrenal causes of proteinuria and hematuria [8-11].

Knowledge for describing and interpreting complex urine protein patterns has accumulated in recent years, a result of collaboration between nephrologists and clinical chemists. We have tried to implement this knowledge in the form of “if/then” rules in the knowledge base of UPES, a knowledge base that contains facts and strategies drawn from literature as well as from heuristics and empirical guidelines. The rules have been worked out in close collaboration with specialists in the field of urine protein differentiation.

Because various nephropathies could not be sufficiently identified by interpretation of excretion patterns when based on rules alone, we have used another method of knowledge representation, geometric distance classification, to extract and apply the knowledge of this multivariate pattern recognition. Using this hybrid model of a knowledge base, UPES is able to process the laboratory results provided and to propose a medical report generated from 36 text elements. Twenty-four of those elements (all the ones used in this paper) are listed in the Appendix.
Institut für Klinische Chemie, Städt. Krankenhaus München-Bogenhausen, Englschalkinger Str. 77, D-81925 München, Germany.
Author for correspondence. Fax +49 89 9270 2113; e-mail wguho@pclabor.uni-bremen.de.
Dedicated to H. Keller of Zürich (Switzerland), on the occasion of his 70th birthday. This paper contains part of the results of the doctoral thesis of M.I.
Received November 7, 1995; accepted April 1, 1996.
Nonstandard abbreviations: UPES, Urine Protein Expert System; β-NAG, N-acetyl-β,D-glucosaminidase; and GFR, glomerular filtration rate.

Materials and Methods

Analytical procedures. Test strip screening was performed with test strips from Behring (Marburg, Germany). Quantitative determinations of total protein, albumin, α1-microglobulin, IgG, α2-macroglobulin (turbidimetrically), N-acetyl-β,D-glucosaminidase (β-NAG) (kinetically and photometrically), and creatinine in urine as well as serum concentrations of creatinine and α1-microglobulin were performed as described elsewhere [12]. The reference values used are from previous publications [7, 9].
Hardware and software. The knowledge-based system for urine protein analysis was developed by using an IBM-compatible PC (80386 CPU) with 1 MB RAM and DOS. A Turbo C Compiler (V. 2.0; Borland, Munich, Germany) and a BGI Printer Toolkit (Ryle Design, Mt. Pleasant, MI) served as programming tools. The statistical software package SAS (V. 6.10; SAS Institute, Cary, NC) was used to perform discriminant analysis.
Geometric distance classification. Geometric distance classification is a method for describing and separating multidimensional pattern classes. Patterns are defined by complete quantitative or qualitative data. Patterns of distinguishable classes form distinct clusters in multidimensional spaces. In geometric distance classification, groups of geometric figures such as spheroids and ellipsoids are used to represent these clusters (Fig. 1).

The distance classifier GEODICLA [13] was developed separately to determine the position and size of such spheroids and ellipsoids automatically. The program selects random members of each class from a training set of typical examples and defines their geometric “region of influence” [14]. This is done by taking their coordinates as the centers of the figures and extending a user-defined minimal radius until reaching either a maximum radius or the “nearest” example of a different class. If an example is picked that is already covered by a figure belonging to the same class, then this example can be classified already and does not need its own region of influence. When every training example is covered, the training is stopped. Reclassifying the training set now always results in a 100% classification rate. The resulting geometric shapes can be adapted manually after this automatic “learning process.”

Using the software tool GEODICLA, we have generalized the information contained in the urine protein patterns of two training sets and condensed them into two sets of figures: circles and ellipses. To classify an unknown urine pattern, we compare it with these representative sets in UPES: The geometric distance from this pattern to the center of each of the circles/ellipses is calculated and compared with the radius of each circle/ellipse. If the pattern lies within a circle/ellipse, the associated class is stored. Comparison of the stored classes leads UPES to its diagnostic conclusion.
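The learning and classification steps described above can be illustrated with a minimal Python sketch. This is not the GEODICLA program itself: the function names `train_regions` and `classify`, the restriction to circles in two dimensions, the clamping of the radius to the interval between the minimal and maximal radius, and the toy data are all assumptions made for illustration.

```python
import math
import random


def train_regions(points, labels, r_min, r_max, seed=0):
    """Build circular "regions of influence" that cover every training example.

    Returns a list of (center, radius, label) tuples.
    """
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)  # training examples are visited in random order
    regions = []
    for i in order:
        point, label = points[i], labels[i]
        # An example already covered by a figure of its own class needs no figure.
        if any(lab == label and math.dist(point, c) < r for c, r, lab in regions):
            continue
        # The nearest example of a different class limits how far the circle grows.
        nearest_other = min(
            (math.dist(point, q) for q, lab in zip(points, labels) if lab != label),
            default=math.inf,
        )
        # Assumption: the radius is clamped to the interval [r_min, r_max].
        radius = max(r_min, min(r_max, nearest_other))
        regions.append((point, radius, label))
    return regions


def classify(pattern, regions):
    """Return the set of classes whose circles contain the pattern."""
    return {lab for c, r, lab in regions if math.dist(pattern, c) < r}


# Hypothetical toy data: two well-separated classes.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (2.0, 2.0), (2.1, 1.8), (1.9, 2.2)]
labels = ["A", "A", "A", "B", "B", "B"]
regions = train_regions(points, labels, r_min=0.1, r_max=1.0)
```

Because every training example is either the center of a circle of its own class or lies inside one, reclassifying the training set always returns the correct class among the stored classes, which corresponds to the 100% reclassification rate mentioned in the text.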
Training sets. Protein patterns of 503 second morning urines from 267 patients with clinically or histologically diagnosed nephropathies were used to train the distance classifier GEODICLA. The urines were collected from patients of the II. Medical Department of the Hospital München-Harlaching and of the III. Medical Department and the Department of Neurosurgery of the Hospital München-Bogenhausen. Depending on their clinical diagnoses, patients were assigned to the following diagnostic groups:

primary glomerulopathy: different forms of glomerulonephritis, histologically documented
secondary glomerulopathy: diabetic nephropathies, clinical diagnoses
interstitial nephropathy: e.g., acute tubulo-toxic nephropathy, chronic interstitial nephropathy, partly histologically documented
renal dysfunction: protein excretion patterns ranging from normal values to as much as twice the upper reference limit from patients from any of these three diagnostic groups

Diagnoses that were not histologically documented were based on clinical criteria (e.g., anamnesis, clinical examination, laboratory results, medical imaging, clinical course) and made by the physician treating the patient. Table 1 summarizes the composition of the training set.
Validation set. To evaluate the diagnostic interpretation of urine protein patterns, we used data from 129 urine analyses. These test data were collected from 94 patients of the II. Medical Department of the Hospital München-Harlaching and the III. Medical Department of the Hospital München-Bogenhausen. As in the training set, the urines were assigned to the diagnostic groups primary glomerulopathy, secondary glomerulopathy, and interstitial nephropathy, according to their diagnoses (Table 1).
Discriminant analysis. To compare the diagnostic performance of the distance classifier with the performance of a statistical method, we performed classificatory linear and quadratic discriminant analysis. We used the training set to compute the parameters (coefficients and constants) of the linear and quadratic functions. Equal prior probabilities were assumed for all four diagnostic groups. The same validation data were used to evaluate the results of discriminant analysis as were used with geometric distance classification (Table 1).
Fig. 1. Use of circles to describe clusters of two different classes: (A) individual examples of two different classes forming two distinct clusters; (B) an optimal characterization of the clusters by using six circles (GEODICLA; see text); (C) the resulting six representatives of the two classes.
Table 1. Composition of the training and the validation collective.

                            Training collective             Validation collective
                            Urine samples   Patients        Urine samples   Patients
Diagnostic groups           n      %        n      %        n      %        n      %
Primary glomerulopathy      285    57       117    44       46     36       27     29
Secondary glomerulopathy    123    24       97     36       76     59       62     66
Interstitial nephropathy    66     13       33     12       7      5        5      5
Renal dysfunction           29     6        20     7        -      -        -      -
Total                       503             267             129             94

Results

DATA INPUT

For interpretation of a urine protein pattern, UPES requires at least the data for urine creatinine, total protein, albumin, and α1-microglobulin. For differential diagnosis during the decision process, the system asks for data on IgG, α2-macroglobulin, and β-NAG if necessary. The program refers all quantitative measurements to the urine creatinine content to take into account the concentration of the urine sample [7]. These quantitative data are processed together with the results of the urine test strips for assessing leukocytes (granulocyte esterase), hemoglobin (pseudoperoxidase), protein, and glucose. As an option, the glomerular filtration rate (GFR) can be considered by providing the data for serum creatinine and serum α1-microglobulin. In addition to the results of the serum and urine analysis, data concerning the patient and the request for urine protein differentiation can be entered into UPES by using an input screen or can be imported automatically by retrieving a file.

KNOWLEDGE BASE

The knowledge base of UPES is divided into five modules (Plausibility and consistency check, Hematuria, Leukocyturia, Proteinuria, and GFR), which are considered if necessary. The implemented strategy is represented as if/then rules. The geometric distance classification is used only in the Proteinuria module to interpret the marker protein patterns.

Plausibility and consistency check. All data are checked for plausibility during the input or import process; formats and thresholds are used to exclude values that exceed medical and analytical ranges. For analytical validation, this module considers the values for total protein, albumin, test strip protein, and the two serum measurements (creatinine and α1-microglobulin). A warning appears on the screen (“Discrepancy between test strip, albumin, and total protein!”) if the comparison of the test strip result and the quantitative measurements fulfills one of the following conditions:

protein test strip positive and total protein <200 mg/L
protein test strip negative and albumin >300 mg/L
albumin > total protein and albumin >50 mg/L

These rules take into account that the detection limit of the test strip is ~300 mg/L albumin and thus detect false-positive and false-negative test strip results.

If the value for urine protein excretion is normal and one of the serum values indicates a decreased GFR (see next section), the user is asked to check the input data.

Medical assessment of GFR. The GFR module is considered only if the concentrations of the optional serum analytes creatinine and α1-microglobulin are provided. α1-Microglobulin partially fills the diagnostic gap associated with creatinine, by sometimes detecting a decrease of GFR earlier than creatinine does [15-17]. A major restriction of the GFR is unlikely if both serum analytes are within their reference ranges (text element 1; see Appendix). The GFR is assumed to be decreased if concentrations of both analytes are increased (text element 2). In combination with a normal urine excretion pattern, this is interpreted as a loss of functioning nephrons that is completely compensated by the remaining nephrons (text element 3).

An increase of only α1-microglobulin in serum indicates a possible restriction in glomerular clearance (diagnostic gap of creatinine). In this case, determination of creatinine clearance is recommended to confirm or to exclude this suspicion (text element 4). If only creatinine is increased, this more likely indicates the presence of pseudocreatinines or increased muscle mass (text element 5), given the greater diagnostic sensitivity of α1-microglobulin.

Medical assessment of hematuria. Whenever the test strip result for blood is positive, the Hematuria module is considered, to distinguish prerenal from glomerular, tubular, and postrenal causes.

Prerenal causes of the test strip result are assumed if the criteria for prerenal proteinuria are met (i.e., a “protein gap”; see text element 6) [18]. If albumin excretion is <100 mg/L, differentiation of renal and postrenal hematuria by urine protein analysis is not possible [9]. In such cases, UPES suggests using phase-contrast microscopy to look for dysmorphic erythrocytes [19, 20] (text element 7).

At higher albumin concentrations, the system considers the ratios of albumin with α2-macroglobulin, IgG, and α1-microglobulin to assign the hematuria to a renal (glomerular or tubulo-interstitial) or postrenal bleeding [9, 10]. If α2-macroglobulin and IgG results have not yet been provided, UPES asks for their manual input.

Because of their molecular size, only small amounts of α2-macroglobulin (250 kDa) and IgG (125 kDa) usually pass the glomerular filter, and those are reabsorbed in the tubule. When albumin ratios with these proteins in urine are similar to those in plasma, therefore, a postrenal lesion is indicated (α2-macroglobulin/albumin >0.02 and IgG/albumin >0.2). In this case, the system proposes that the clinician repeat the urine protein differentiation to exclude additional renal hematuria after postrenal hematuria has ceased (text element 8).

In renal hematuria (α2-macroglobulin/albumin <0.02), glomerular and tubulo-interstitial causes can be distinguished by the concentrations of IgG: In tubular hematuria, even small amounts of filtered IgG cannot be reabsorbed (IgG/albumin >0.2). Increased excretion of the tubular marker α1-microglobulin is taken as additional confirmation of the tubulo-interstitial lesion (text element 9).

Medical assessment of leukocyturia. The Leukocyturia module is considered whenever the leukocyte esterase test strip shows a positive result. An isolated leukocyturia in combination with a normal urine protein pattern indicates either a contamination of the urine sample or an inflammation of the lower urinary tract (text element 10). Leukocyturia with a slight glomerular proteinuria (total protein <150 mg/g creatinine, albumin <100 mg/g creatinine, α1-microglobulin <14 mg/g creatinine) can have both renal and postrenal causes (text element 11), whereas substantial glomerular or tubular proteinuria indicates renal involvement in an inflammatory process (text element 12).

Medical assessment of proteinuria. In contrast to the previous two modules, the Proteinuria module is used in all cases to interpret the various urine protein ratios. Active renal disease can be excluded if test strip results are negative and the concentrations of urine total protein, albumin, and α1-microglobulin are within their reference ranges (text element 13). Normal excretion of both marker proteins but increased IgG in urine may indicate (e.g.) monoclonal gammopathies (text element 14).

If total protein excretion is >300 mg/L and the sum of albumin, α1-microglobulin, and IgG is <30% of the total protein excretion, prerenal causes such as Bence Jones proteinuria might account for this disproportion [18]. Immunofixation is suggested for further confirmation. This finding initiates a temporary report (text element 15), and the decision process is stopped.

Renal proteinuria can be quantitatively and qualitatively described and assigned to different kinds of nephropathies by analysis of the excretion pattern of albumin, α1-microglobulin, and IgG. Using albumin as a glomerular marker and α1-microglobulin as a tubular marker, UPES describes the extent of glomerular and tubular proteinuria as borderline, slight, significant, distinct, and nephrotic, according to the thresholds given in Table 2. The IgG/albumin ratio helps to distinguish “selective” (<0.03) from “nonselective” (>0.03) proteinuria in glomerulopathies with albuminuria >500 mg/g creatinine. An example of a description of a renal proteinuria is text element 16 (albumin 1100 mg/g creatinine, α1-microglobulin 25 mg/g creatinine, IgG 15 mg/g creatinine).

Apart from this quantitative and qualitative description, a renal proteinuria can also be assigned to different diagnostic classes of renal diseases.

Table 2. Description of tubular and glomerular proteinuria according to the excretion of the marker proteins albumin and α1-microglobulin.

α1-Microglobulin        Albumin
(mg/g creatinine)       (mg/g creatinine)       Description
14-20                   20-30                   Borderline
20-50                   30-100                  Slight
50-100                  100-1000                Significant
>100                    1000-3000               Distinct
-                       >3000                   Nephrotic
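The grading by Table 2 and the selectivity rule from the text can be sketched in a few lines of Python. This is an illustration only, not the UPES source code: the names `grade` and `selectivity` are hypothetical, the handling of values that fall exactly on a threshold is an assumption, and so is the alignment of the four α1-microglobulin ranges with the first four descriptions (borderline starting at 14 mg/g creatinine).

```python
import bisect

# Lower limits of the ranges in mg/g creatinine, read from Table 2.
ALBUMIN_BOUNDS = [20, 30, 100, 1000, 3000]
A1M_BOUNDS = [14, 20, 50, 100]  # assumed alignment: no "nephrotic" range for alpha1-microglobulin
GRADES = ["normal", "borderline", "slight", "significant", "distinct", "nephrotic"]


def grade(value, bounds):
    """Map an excretion value (mg/g creatinine) to its verbal description."""
    # Assumption: a value exactly on a threshold belongs to the higher range.
    return GRADES[bisect.bisect_right(bounds, value)]


def selectivity(igg, albumin):
    """Classify a glomerular proteinuria as selective or nonselective.

    The rule is applied only when albuminuria exceeds 500 mg/g creatinine.
    """
    if albumin <= 500:
        return None
    return "selective" if igg / albumin < 0.03 else "nonselective"


# The example values given for text element 16.
albumin, a1m, igg = 1100, 25, 15
description = (grade(albumin, ALBUMIN_BOUNDS), grade(a1m, A1M_BOUNDS), selectivity(igg, albumin))
```

For the example values quoted for text element 16, this sketch yields a distinct glomerular proteinuria, a slight tubular proteinuria, and a selective pattern (IgG/albumin of about 0.014).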
The training sets of patients with clinically or histologically documented diagnoses show that tubulo-interstitial nephropathies and primary and secondary glomerulopathies are each characterized by a specific urine protein pattern. The clusters of these disease groups can be defined and separated in logarithmic coordinates, with the marker proteins albumin and α1-microglobulin making up the x- and y-axes, respectively (Fig. 2, top).

A renal proteinuria with albumin <40 mg/g creatinine and α1-microglobulin <28 mg/g creatinine cannot be clearly assigned to only one of the three disease classes because of the overlapping zones of the clusters. Such a slight proteinuria can be interpreted as “renal dysfunction,” which can have renal and extrarenal causes: e.g., metabolic disorder, fever, intense physical exercise (see text element 17). In the overlapping zone between primary and secondary glomerulopathies as well as interstitial nephropathies, further diagnostic information might be achieved by taking IgG into consideration (Fig. 2, bottom).

An excretion pattern from an unknown patient can be assigned to any of the diagnostic groups by comparing it with the position of the different clusters in both coordinates.
To implement this visual classification in the knowledge-based system UPES, we used the geometric distance classifier GEODICLA [13]. After five fictitious patterns had been added to the training samples to detect implausible marker constellations (Fig. 2, top), and the learning and abstracting process of GEODICLA had been performed, the information contained in these 508 single urine protein patterns of the training sets was specifically condensed into some representative examples: The clusters of the different classes were now described with 60 circles (albumin-α1-microglobulin patterns) and 15 ellipses (albumin-IgG patterns).

For diagnostic interpretation of urine findings of an unknown patient, UPES calculates the logarithm of the patient’s concentrations of albumin and α1-microglobulin. The geometric distance of this pattern to the centers of each of the 60 circles is calculated and compared with their radii. If the pattern lies within a circle, the corresponding class is stored.
According to the classes stored after this first classification, the system identifies the excretion pattern as belonging with one of the following diagnostic groups: renal dysfunction (text element 17), primary glomerulopathy (text element 18), secondary glomerulopathy (text element 18), primary or secondary glomerulopathy (text element 19), tubulo-interstitial nephropathy (text element 18), glomerulopathy with interstitial involvement or interstitial nephropathy with secondary glomerulopathy (text element 20), or implausible marker protein pattern.

Fig. 2. Albumin-α1-microglobulin patterns (top) and albumin-IgG patterns (bottom) of the diagnostic classes of the training collective: tubulo-interstitial nephropathies, primary and secondary glomerulopathies, the dysfunctional collective, and the plausibility collective. Albumin (mg/g creatinine) is plotted on the x-axis.
Only if this first classification based on the α1-microglobulin/albumin ratio reveals an ambiguous diagnosis (elements 19 and 20) does UPES consider IgG in a second step: One of a pair of diagnoses in an ambiguous diagnosis is more probable if the albumin/IgG pattern of the patient is covered by ellipses of only one class (unambiguous classification; text element 21). This two-step pattern identification in UPES reflects that the diagnostic discriminatory power of IgG is less than that of α1-microglobulin.
Depending on the result of the two-step classification, additional rules are considered. If the diagnostic pattern classification reveals a glomerulopathy, the system takes into account the possibility that the tubular component of a proteinuria might result from tubular overload caused by an excessive glomerular proteinuria. In nephrotic proteinuria (albumin >3 g/g creatinine), therefore, the extent of the tubular share is corrected by using the following equation, derived from urine findings in selected patients with glomerulonephritis whose renal interstitial space was devoid of major histopathological findings [21]:

$$\alpha_1\text{-microglobulin (corr.)} = \alpha_1\text{-microglobulin} - 4.7 \cdot e^{0.00022 \cdot \text{albumin}}$$

This equation approximately describes the lower margin of the cluster of primary glomerulopathies in Fig. 2 and allows estimation of the amount of tubulo-interstitial involvement in glomerular diseases: Tubular proteinuria is assumed to result from tubular overload if the corrected value of α1-microglobulin is <14 mg/g creatinine (text element 22). α1-Microglobulin concentrations >14 mg/g creatinine are interpreted as showing an involvement of the renal interstitial space in glomerulonephritis (text element 23).
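A short numerical sketch may make the overload correction concrete. It assumes that the printed correction reads as the measured α1-microglobulin minus 4.7·e^(0.00022·albumin), with both analytes in mg/g creatinine; this form and the coefficient are a reading of the typeset formula and should be checked against the original publication. The function names are hypothetical.

```python
import math

A1M_UPPER_REFERENCE = 14.0  # mg/g creatinine, threshold used in text elements 22 and 23


def corrected_a1m(a1m, albumin):
    """Subtract the share of alpha1-microglobulin attributed to tubular overload.

    Assumed form of the correction: a1m - 4.7 * exp(0.00022 * albumin).
    """
    return a1m - 4.7 * math.exp(0.00022 * albumin)


def tubular_component(a1m, albumin):
    """Interpret the tubular share of a nephrotic proteinuria."""
    if albumin <= 3000:
        return None  # the correction is applied only in nephrotic proteinuria
    if corrected_a1m(a1m, albumin) < A1M_UPPER_REFERENCE:
        return "tubular overload"
    return "interstitial involvement"
```

Under this reading, a patient with albumin of 5000 mg/g creatinine and α1-microglobulin of 20 mg/g creatinine has a corrected value of roughly 6 mg/g creatinine, so the tubular proteinuria would be attributed to overload rather than to interstitial damage.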
To differentiate acute from chronic tubular disorders in interstitial nephropathies, UPES requests data for the catalytic concentration of the tubular enzyme β-NAG. In acute lesions (e.g., caused by nephrotoxic antibiotics), the excretion of β-NAG usually exceeds 20 U/L if α1-microglobulin excretion is >40 mg/g creatinine (text element 24) [12]. Chronic tubulo-interstitial diseases are described by increased α1-microglobulin excretion without a major increase of β-NAG.

OUTPUT (FINAL REPORT)

UPES composes the final report from the selected text items after the urine and serum protein findings have been medically assessed.
EVALUATING THE KNOWLEDGE BASE WITH THE VALIDATION SET

To compare the medical interpretation of proteinuria by UPES, statistical methods, and human expertise, we assessed the results of urine protein differentiation of the validation set (129 urines from 94 patients) as classified by UPES, linear and quadratic discriminant functions, and four experts in our laboratory who were familiar with this method of urine analysis. The results of these evaluations are given in Table 3; misclassifications are summarized as “others.”

Because there are no gold standards for evaluating urinary protein patterns, it was difficult to define correct and false interpretations. Patients with a documented diabetic nephropathy, for example, showed many different patterns of protein excretion. The patterns were described by human experts and by the system as reflecting glomerular and (or) tubular dysfunction, secondary glomerulopathy, primary or secondary glomerulopathy, or mixed (glomerular and tubular) nephropathy, and all of these diagnostic groups were assumed to be a correct interpretation. Only the description “primary glomerulopathy” would be judged a clear misclassification of these patients.

UPES. Of 46 urines from patients with glomerulonephritis, UPES identified 9 primary glomerulopathies (20%) by first-step classification. The correct but more global diagnosis “primary or secondary glomerulopathy” was chosen in the majority of cases (31 of 46 urines, 67%) because of the overlapping zones of the albumin/α1-microglobulin patterns. Using the IgG excretion in a second-stage pattern classification correctly assigned 6 of these 31 ambiguous cases to the primary glomerulopathy group. Two patients with glomerulonephritis and albuminuria >10 g/g creatinine could not be interpreted by UPES.

Only 2 of 76 urines (3%) with secondary glomerulopathies were misclassified as primary glomerulopathies by UPES. Both of these urines showed substantial albuminuria (844 and 552 mg/g creatinine) and IgG excretion (63 and 59 mg/g creatinine) but no significant tubular proteinuria. Again, UPES assigned most (46, or 61%) of the 76 urines to the diagnostic group “primary or secondary glomerulopathy.” The excretion ratio for IgG/albumin misled the system in 3 of these 46 decisions to favor the primary type of glomerulopathy. The remaining 27 urines (36%) were classified as “renal dysfunction” because of the low quantities of marker proteins excreted.

Finally, UPES interpreted the urine patterns of all 7 interstitial nephropathies correctly.

Discriminant functions. As an alternative classification method, we used the discriminant functions estimated from the albumin, α1-microglobulin, and IgG patterns of the training set (no implausible constellations were included). Each protein pattern was classified to the diagnostic group having the highest group probability, as computed with linear and quadratic discriminant functions. Resubstitution of the training set resulted in a reclassification rate of 75% by linear discriminant functions and 79% by quadratic discriminant functions.

To allow consideration of an ambiguous classification, as in UPES, we took into account the difference between the two highest group probabilities. If this difference was <0.3, the pattern was assigned to both classes (ambiguous classification).

Linear discriminant functions described 38 samples (29%) of the validation set as caused by “renal dysfunction”; 68 patterns (53%) were interpreted correctly as belonging to other diagnostic groups matching the known diagnosis. By quadratic discriminant functions, 37 cases (29%) were classified as “renal dysfunction,” whereas other, correct diagnostic classes were chosen for 65 samples (50%). In total, there were 23 (18%) vs 27 (21%) misclassifications by linear and quadratic discriminant functions, respectively (Table 3).
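The rule for ambiguous classification by discriminant functions (assign a pattern to both classes when the two highest group probabilities differ by less than 0.3) can be sketched as below. The function name `assign_classes` and the example probabilities are hypothetical; the group probabilities themselves would come from the linear or quadratic discriminant functions, which are not reproduced here.

```python
def assign_classes(group_probabilities, margin=0.3):
    """Return one class, or the two most probable classes if they are too close.

    group_probabilities maps each diagnostic group to its group probability.
    """
    ranked = sorted(group_probabilities.items(), key=lambda item: item[1], reverse=True)
    (best, p_best), (runner_up, p_runner_up) = ranked[0], ranked[1]
    if p_best - p_runner_up < margin:
        return {best, runner_up}  # ambiguous classification
    return {best}


# Hypothetical group probabilities for two urine patterns.
ambiguous = assign_classes(
    {"primary GP": 0.50, "secondary GP": 0.40, "TP": 0.05, "Dys": 0.05}
)
clear = assign_classes(
    {"primary GP": 0.80, "secondary GP": 0.10, "TP": 0.05, "Dys": 0.05}
)
```

In this sketch the first pattern is reported as “primary or secondary glomerulopathy,” whereas the second is assigned to the primary glomerulopathy group alone.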
Human experts. The quality of the human expertise varied greatly, depending on the experience of each expert with urine protein differentiation. Generally, the humans interpreted more proteinurias as being “renal dysfunction” than did UPES. Two experts more often decided on an unambiguous diagnosis, at the risk of increasing their misclassifications; the other two experts preferred the more general diagnosis “primary or secondary glomerulopathy,” to be on the safe side. Notably, one expert relied on a positive glucose test strip result to classify a glomerulopathy as the secondary type. In contrast to UPES, he and two other experts failed to identify the primary glomerulopathy in a 32-year-old woman with IgA nephropathy and familial glucosuria.

Table 3. Diagnostic interpretation of urine protein differentiations of 46 primary glomerulopathies, 76 secondary glomerulopathies, and 7 interstitial nephropathies by UPES, linear and quadratic discriminant functions, and four experts.

Clinical diagnosis: primary GP
Expertise(a)   Prim. GP   GP    GP/TP   Dys   Others
UPES           9          31    1       3     2(b)
Expert 1       2          41    1       2     0
Expert 2       19         18    6       3     0
Expert 3       0          38    5       3     0
Expert 4       22         8     9       5     2
LDF            22         12    1       5     6
QDF            22         12    0       2     10

Clinical diagnosis: secondary GP
Expertise(a)   Sec. GP    GP    GP/TP   Dys   Others
UPES           0          46    1       27    2(c)
Expert 1       0          43    0       32    1
Expert 2       0          32    0       34    10
Expert 3       11         30    1       32    2
Expert 4       5          17    9       39    6
LDF            11         15    0       33    17
QDF            13         11    0       35    17

Clinical diagnosis: TP
Expertise(a)   TP         GP/TP   Dys   Others
UPES           6          0       1     0
Expert 1       5          0       1     1
Expert 2       5          0       0     2
Expert 3       6          0       1     1
Expert 4       5          2       0     0
LDF            7          0       0     0
QDF            5          2       0     0

(a) Correct diagnoses are listed: (prim./sec.) GP = (primary/secondary) glomerulopathy, TP = tubulo-interstitial nephropathy, or Dys = renal dysfunction, whereas patterns interpreted as implausible constellations and urines not classified or misclassified are summarized as Others.
(b) 2 patterns not classified by UPES.
(c) 2 patterns classified by UPES as a primary glomerulopathy.
LDF, linear discriminant function; QDF, quadratic discriminant function.
Discussion
Evaluation
with the validation data set showed that noninvasive
urine proteindifferentiation
may be a usefuldiagnosticstrategy
in nephrology.
The knowledge-based
system UPES performed
well in diagnostic
interpretation
of urine protein
patterns,
correctly distinguishing
all interstitial
nephropathies
from gbmerulopathies.
It misclassified
only 2 of 129 urines (2%),
incorrectlyconcluding that patternsof significant
glomerular
proteinuria
had instead indicated a primary gbomerubopathy.
Discriminant functionswere not able to deal properly with
the overlapping zones of allclinicalclasses.
The four human
experts also had problems
correctly
classifying
primary
and
secondary gbomerulopathies-which
are difficult to distinguish
by clinical
chemistry means.
After the evaluation,we adjusted the knowledge base of
UPES to improve the medical assessment. We added one circle
to the secondary
glomerulopathy
class so that this diagnostic
group would be considered
in cases of significant
glomerular
proteinuria.Another circlewas also added to the primary
gbomerubopathy
class to ascertain the identification
of cases of
excessive proteinuria.
The addition of these two circles will help
prevent misclassification
in similarcases.
Knowledge-based systems, as a means of rationalization, accelerate the time-consuming process of medical assessment and increase the economic efficiency of a clinical laboratory. Such programs make possible consistent and standardized medical assessment of constant and high quality, especially when dealing with the highly complex data produced in increasingly specialized areas [22-24]. Apart from learning effects, transparent data interpretation rather than simple “data intoxication” [25] may provide clinical physicians with useful additional information [26].
The knowledge-based system we designed provides for the first time a concise decision-supporting system to exclude and differentiate proteinuria, hematuria, and leukocyturia. Working with the complex excretion pattern of different marker proteins, UPES can distinguish prerenal, glomerular, tubulo-interstitial, and postrenal causes of pathological urine findings. By using two different methods for knowledge representation, we essentially implemented the strategy and experience of a specialist in urine protein differentiation as a knowledge base.
Modelling the framework of the knowledge base with if/then rules makes it possible to integrate the heuristics that guide a human expert in the diagnostic decision process. Rules allow the design of a modular knowledge base that maintains a clear structure and facilitates regular updates. Furthermore, the user can easily retrace the decisions formulated by the system. Diagnostic pattern classification in urine protein differentiation can be implemented in a rule base by using constant thresholds to describe the different clusters by squares, like a mosaic. However, good resolution for sufficient representation of the clusters is obtained only by using a large number of thresholds. Thus, the quality of classification is limited by the number of rules needed to compare the pattern of the patient with the margins of all clusters. Although this is easily done in a two-dimensional pattern recognition, more dimensions increase the number and complexity of rules exponentially.
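The growth in rule count can be made concrete with a minimal Python sketch. It is an illustration only and not part of UPES: the function names and the threshold values (arbitrary units) are hypothetical. Each marker axis is cut by fixed thresholds, a pattern is mapped to its mosaic cell, and the number of cells gives the worst-case number of if/then rules.

```python
from bisect import bisect_right

def cell_index(value, thresholds):
    """Return the index of the interval that `value` falls into.

    `thresholds` must be sorted in ascending order; k thresholds
    split one axis into k + 1 intervals.
    """
    return bisect_right(thresholds, value)

def mosaic_cell(pattern, thresholds_per_axis):
    """Map a marker pattern to its cell in the threshold mosaic."""
    return tuple(cell_index(v, t) for v, t in zip(pattern, thresholds_per_axis))

def cell_count(thresholds_per_axis):
    """Number of cells, i.e. the worst-case number of rules required."""
    count = 1
    for thresholds in thresholds_per_axis:
        count *= len(thresholds) + 1
    return count

# Hypothetical thresholds for two marker proteins (not taken from UPES).
two_markers = [[10, 30, 100, 300], [5, 14, 50, 100]]

print(mosaic_cell((45.0, 7.0), two_markers))  # cell of one example pattern
print(cell_count(two_markers))                # two markers: 5 * 5 cells
print(cell_count(two_markers * 2))            # four markers, same resolution
```

With four thresholds per axis, two markers give 25 cells, whereas four markers at the same resolution already give 625.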
Because rules, therefore, did not appear to be the optimal solution, we looked for alternative ways of knowledge representation. Classificatory discriminant analysis [27], for example, can designate and separate the different diagnostic groups in a statistical way, but several assumptions are necessary that are not always met (e.g., multivariate normal distribution of data). Moreover, a larger set of examples of all classes is necessary to find discriminating functions that are generally valid, and every change in this collective (e.g., adding a new patient not yet correctly classified) requires a complete recalculation. Nevertheless, we used the training set of urine protein patterns to estimate associated linear and quadratic discriminant functions. Using these functions to classify the validation set, however, revealed major difficulties in dealing with overlapping zones of the diagnostic groups.
Another flexible tool used successfully in laboratory medicine for robust pattern recognition is neural networks [28-30]. These models for knowledge processing and representation are able to deal with complex, uncertain, and even incomplete data. In a self-organizing process they use the information contained in training data to build up and adjust their “knowledge.” After this dynamic learning process, the adapted network structure itself incorporates the knowledge base [31]. Quasi-parallel processing of data enables fast classification in neural networks but also makes it difficult for users to influence and understand their behavior. The successful training of these “black boxes” depends on their architecture (i.e., number of neurons and layers). The lack of general rules for construction means that finding the right configuration of a neural network requires much empirical testing.
In designing UPES, we chose another way to simulate the diagnostic identification of marker patterns as an important part of the expert’s considerations. Geometric distance classification allows the system to recognize and separate quantitative multidimensional patterns [14]. Implemented in the flexible software tool GEODICLA [13], this classification method can describe and separate complex clusters in terms of spheroids and ellipsoids with straight and oblique axes. Information contained in a training set is specifically integrated and generalized in a rapid automated learning process. In contrast to most neural networks and statistical classification methods, the resulting representatives always guarantee a 100% reclassification rate when classifying the training collective. Multiple features such as mathematical preprocessing, several learning modes, and different ways of distance calculation help influence the self-organizing process and optimize its results. The geometric figures and their parameters can easily be adjusted and extended. A simple local modification of the geometric knowledge base, e.g., if a pattern is not yet correctly classified or is not classified at all, adds new knowledge to a classification system. Updates of the knowledge base are thus facilitated.
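The idea of classifying a bivariate pattern by its distance to geometric figures, and of extending the knowledge base by adding a figure, can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the GEODICLA algorithm: the `Ellipse` class, the `classify` function, the normalized-distance rule, and all centers and radii are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Ellipse:
    label: str          # diagnostic class represented by this figure
    cx: float           # center, first marker (hypothetical scale)
    cy: float           # center, second marker
    a: float            # semi-axis along the ellipse's own x direction
    b: float            # semi-axis along the ellipse's own y direction
    angle: float = 0.0  # rotation in radians; 0 means straight axes

    def distance(self, x, y):
        """Normalized distance: <= 1 inside the ellipse, > 1 outside."""
        dx, dy = x - self.cx, y - self.cy
        cos_t, sin_t = math.cos(self.angle), math.sin(self.angle)
        u = dx * cos_t + dy * sin_t    # coordinates in the ellipse frame
        v = -dx * sin_t + dy * cos_t
        return math.hypot(u / self.a, v / self.b)

def classify(x, y, figures):
    """Return the label of the closest figure containing (x, y), else None."""
    best = min(figures, key=lambda f: f.distance(x, y), default=None)
    if best is None or best.distance(x, y) > 1.0:
        return None
    return best.label

# Hypothetical knowledge base; the values are invented for illustration.
knowledge_base = [
    Ellipse("primary glomerulopathy", 3.0, 1.0, 1.0, 0.5),
    Ellipse("tubulo-interstitial nephropathy", 1.5, 2.0, 0.5, 0.5),
]
print(classify(3.2, 1.1, knowledge_base))  # covered by the first ellipse
print(classify(2.2, 1.6, knowledge_base))  # not covered by any figure

# Local update: one new oblique figure adds knowledge without retraining.
knowledge_base.append(
    Ellipse("secondary glomerulopathy", 2.2, 1.5, 0.4, 0.4, math.pi / 6)
)
print(classify(2.2, 1.6, knowledge_base))  # now covered by the new figure
```

In this sketch, a pattern that no figure covers stays unclassified until a figure is added locally, and the existing figures are left untouched.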
Geometric distance classification enables UPES to make robust and nonparametric pattern recognitions. Further, the diagnostic classification can be elucidated by showing the circles/ellipses on the screen together with a symbol representing the patient’s pattern. Thus, a user does not have to accept the UPES interpretation of the marker proteins as if it were a Greek oracle.
The quality of a decision-supporting system for daily routine assessment such as UPES depends on easy and comfortable use of the system as well as on the integrated knowledge being highly accurate and sufficiently extensive. Because widespread use of the system depends on its acceptance by users, considerations of comfort and safety have played a major role in its development. The complete integration of the knowledge-based system in the computer network structure of our laboratory, as well as the automated data import and export, minimizes errors during data transfer and contributes greatly to the comfortable and problem-free use of UPES in daily routine. Use of the programming language C guarantees that medical evaluation of the data is not time consuming: UPES takes 2 s to compose the reports from a file containing data for 30 patients (the estimated average number of daily requests), using a Model 486 PC (33 MHz). In practice, >90% of the reports created by UPES are not modified. In the remaining cases, additional clinical information (e.g., known renal transplant) is considered.
Apart from these practical aspects, the credibility and reliability of a decision by implemented knowledge are an essential condition for the widespread use of an expert system, especially in medical fields. Evaluation with the validation set confirmed that interpretation of urine protein differentiation is a complex and difficult task sufficiently solved by UPES. Moreover, the evaluation results provided evidence that even experts can learn from a continually growing knowledge base of an expert system. Given that gold standards have yet to be defined for many of the observed protein patterns (e.g., “dysfunction”), future prospective studies may help improve the predictive qualities of the system. Consideration of additional clinical information, implementation of other urine results (e.g., microscopy), and extension to previous urine protein patterns are currently under development.
We conclude that urine protein differentiation in its present form is superior to traditional urine analysis as a mirror of renal function [32] and is a valuable addition to the morphological information provided by histopathology and medical imaging. Use of the decision-supporting system UPES for medical assessment of urine protein differentiation provides a standard of high and constant quality. A graduated and transparent decision process is implemented in a hybrid knowledge base that uses both production rules and geometric distance classification as complementary methods of knowledge representation. In the hands of a responsible physician, UPES can be a useful tool for increasing the efficiency and quality of a laboratory.
References
1. Spackman KA, Connelly DP. Knowledge-based systems in laboratory medicine and pathology. Arch Pathol Lab Med 1987;111:116-9.
2. Winkel P. The application of expert systems in the clinical laboratory. Clin Chem 1989;35:1595-600.
3. Pesce AJ, Boreisha I, Pollak VE. Rapid differentiation of glomerular and tubular proteinuria by sodium dodecyl sulfate polyacrylamide gel electrophoresis. Clin Chim Acta 1972;40:27-34.
4. Boesken WH, Kopf K, Schollmeyer P. Differentiation of proteinuria diseases by diskelectrophoretic molecular weight analysis of urinary proteins. Clin Nephrol 1973;1:311-6.
5. Peterson PA, Evrin PE, Berggård I. Differentiation of glomerular, tubular and normal proteinuria: determinations of urinary excretion of β2-microglobulin, albumin and total protein. J Clin Invest 1969;48:1189-98.
6. Cameron JS, Blandford G. The simple assessment of selectivity in high proteinuria. Lancet 1966;i:242.
7. Hofmann W, Guder WG. A diagnostic programme for quantitative analysis of proteinuria. J Clin Chem Clin Biochem 1989;27:589-600.
8. Hofmann W, Rossmüller B, Guder WG, Edel HH. A new strategy for characterizing proteinuria and haematuria from a single pattern of defined proteins in urine. Eur J Clin Chem Clin Biochem 1992;30:707-12.
9. Hofmann W, Schmidt D, Guder WG, Edel H. Differentiation of hematuria by quantitative determination of urinary marker proteins. Klin Wochenschr 1991;69:68-75.
10. Guder WG, Hofmann W. Differentiation of proteinuria and haematuria by single protein analysis in urine. Clin Biochem 1993;26:277-82.
11. Hofmann W, Sedlmeir-Hofmann C, Ivandić M, Schmidt D, Guder WG, Edel H. Assessment of urinary protein patterns on the basis of clinically characterized patients. Typical examples with reports. Lab Med 1993;17:502-12.
12. Schmidt D, Hofmann W, Guder WG. Adaptation of the diagnostic strategy of urine protein differentiation to the Hitachi 911 analyzer. Lab Med 1995;19:153-61.
13. Ivandić M. Entwicklung und Evaluierung eines wissensbasierten Befundungssystems zur Urineiweißdifferenzierung [Dissertation]. München: Ludwig-Maximilians-Universität, 1995.
14. Barschdorff D, Bothe A. Signal classification using a new self-organising and fast converging neural network. Noise Vibration 1991;9:11-9.
15. Itoh Y, Enomoto H, Takagi K, Kawai T. Clinical usefulness of serum α1-microglobulin as a sensitive indicator for renal insufficiency. Nephron 1983;33:69-70.
16. Weber MH, Verwiebe R. α1-Microglobulin (protein HC): features of a promising indicator of proximal tubular dysfunction. Eur J Clin Chem Clin Biochem 1992;30:683-91.
17. Jung M, Jung K. Low-molecular-mass proteins in serum as markers of glomerular filtration rate: cystatin C, α1-microglobulin and β2-microglobulin. Lab Med 1994;18:461-5.
18. Boege F, Koehler B, Liebermann F. Identification and quantification of Bence-Jones proteinuria by automated nephelometric screening. J Clin Chem Clin Biochem 1990;28:37-42.
19. Birch DF, Fairley KF. Haematuria: glomerular or nonglomerular? Lancet 1979;ii:845-6.
20. Köhler H, Wandel E, Brunck B. Acanthocyturia - a characteristic marker for glomerular bleeding. Kidney Int 1991;40:115-20.
21. Hofmann W. A mathematical equation to discriminate overload proteinuria from tubulo-interstitial involvement in glomerular diseases. Clin Nephrol 1995;44:28-31.
22. Keller H, Trendelenburg C, eds. Clinical biochemistry data presentation and interpretation. Berlin: de Gruyter, 1989.
23. Bepperling C, Hehrmann R, Haas H, Hotz G, Olbricht T, Schmidt R, et al. A knowledge-based system for the interpretation of thyroid hormone measurements: evaluation and optimisation of the system Pro.M.D.-SD in five clinical laboratories. Lab Med 1994;18:564-71.
24. Trendelenburg C, Pohl B. Pro.M.D. expert system and its application in laboratory medicine. Ann Biol Clin 1993;51:226-7.
25. O’Moore RR. Decision-supporting based on laboratory data. Methods Inform Med 1988;27:187-90.
26. Winkel P. Interpreting the results is the expertise of the laboratory. Clin Chim Acta 1994;224:S9-S51.
27. Solberg HE. Discriminant analysis. Crit Rev Clin Lab Sci 1978;9:209-42.
28. Ivandić BT, Kratzer MAA, Fateh-Moghadam A. Analysis of serum electrophoresis pattern by artificial neural networks. Lab Med 1992;16:128-33.
29. Furlong JW, Dupuy ME, Heinsimer JA. Neural network analysis of serial cardiac enzyme data - a clinical application of artificial machine intelligence. Am J Clin Pathol 1991;96:134-41.
30. Reibnegger G, Weiss G, Wachter H. Self-organizing neural networks as a means of cluster analysis in clinical chemistry. Eur J Clin Chem Clin Biochem 1993;31:311-7.
31. McClelland JL, Rumelhart DE. Explorations in parallel distributed processing. Cambridge, MA: MIT Press, 1989.
32. Hofmann W, Regenbogen C, Edel H, Guder WG. Diagnostic strategies in urinalysis. Kidney Int 1994;46(47, Suppl):111S-4S.
Appendix: Text Elements Selected by UPES to Compose a Laboratory Report
1. Based on the serum findings, a major decrease of glomerular filtration rate is not likely.
2. The glomerular filtration rate is reduced.
3. An isolated increase of both serum values in combination with normal urine protein excretion might reflect a loss of functioning nephrons. The protein reabsorption is fully compensated by the remaining nephrons. An active renal disease is unlikely.
4. To exclude the possibility of a reduced GFR, glomerular clearance should be investigated.
5. The isolated increase of serum creatinine might indicate so-called pseudocreatinines. Alternatively, increased muscle mass or a meat diet may be involved.
6. Discrepancy between the sum of albumin, IgG, and α1-microglobulin and the concentration of total protein (single proteins/total protein <0.3) in combination with a positive test-strip result for blood indicates a prerenal hematuria. Definite report follows after additional tests to exclude myoglobinuria or hemoglobinuria.
7. Differentiation of renal and postrenal hematuria by protein analysis is impossible at albumin concentrations <100 mg/L. Phase-contrast microscopy of a fresh morning urine may allow the differentiation of renal and postrenal causes of hematuria (acanthocytes?).
8. Most likely, postrenal hematuria is present. Because additional renal excretion of proteins cannot be excluded, a control after disappearance of hematuria is suggested.
9. Most likely, renal (glomerular/tubulo-interstitial) hematuria is present. A slight additional postrenal source of erythrocytes cannot be excluded.
10. The detection of leukocyte esterase may indicate a possible postrenal inflammation or contamination of the urine with leukocytes.
11. The urine protein differentiation should be repeated after leukocyturia has stopped, because inflammations in the lower urinary tract can also cause a slight proteinuria.
12. The detection of leukocyte esterase may indicate inflammation with renal involvement, if there was no leukocyte contamination during sampling.
13. Analyses of the marker proteins in the urine do not indicate any dysfunction of glomerular protein filtration and tubular reabsorption. No signs of hematuria or granulocyturia are present.
14. An isolated increase of IgG may indicate, e.g., monoclonal gammopathies.
15. Discrepancy between the sum of albumin, IgG, and α1-microglobulin and the concentration of total protein (single proteins/total protein <0.3) indicates a prerenal proteinuria. Immunofixation will be performed to exclude Bence Jones proteinuria. The definitive report will follow after this investigation.
16. A distinct selective glomerular proteinuria with simultaneous slight tubular proteinuria is found.
17. The findings are consistent with impaired glomerular permeability or a tubulo-interstitial dysfunction (or both). A slight increase of the marker proteins does not necessarily indicate a renal disease. If clinical cues are missing, a control measurement made under standardized conditions (no intense physical stress before investigation, optimal metabolic and hypertonic equilibrium of diabetic and hypertonic patients) is recommended.
18. The findings are consistent with a tubulo-interstitial nephropathy/primary glomerulopathy/secondary glomerulopathy.
19. The findings are consistent with a primary or secondary glomerulopathy (e.g., diabetes mellitus, hypertension).
20. The findings are consistent with either (a) a glomerulopathy with impaired tubulo-interstitial reabsorption or (b) an interstitial nephropathy with secondary glomerulopathy.
21. The IgG excretion indicates a primary glomerulopathy/secondary glomerulopathy/tubulo-interstitial nephropathy.
22. The increased excretion of the tubular marker α1-microglobulin is the result of a tubular overload (exhaustion of the tubular reabsorptive capacity).
23. The extent of interstitial fibrosis correlates with the excretion of the tubular marker protein α1-microglobulin.
24. The increased excretion of the tubular enzyme β-NAG indicates a possible acute disorder of proximal tubular cells.
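As an illustration of how a rule might select between text elements 6 and 15, the following Python sketch applies the ratio criterion stated in those elements (single proteins/total protein <0.3). It is a sketch under assumptions and not the UPES rule base: the function name is hypothetical, and the choice of element 15 whenever the blood test strip is negative is an assumption, because the source does not state how the two elements are prioritized.

```python
def prerenal_text_element(albumin, igg, a1_microglobulin, total_protein,
                          blood_strip_positive):
    """Return 6, 15, or None from the single-protein/total-protein ratio.

    All concentrations must share one unit (e.g. mg/L). The 0.3 cutoff
    is quoted from text elements 6 and 15; everything else is assumed.
    """
    if total_protein <= 0:
        raise ValueError("total protein must be positive")
    ratio = (albumin + igg + a1_microglobulin) / total_protein
    if ratio >= 0.3:
        return None  # no discrepancy, so neither prerenal element applies
    # Assumption: a positive blood strip selects element 6, otherwise 15.
    return 6 if blood_strip_positive else 15

# Hypothetical example values in mg/L.
print(prerenal_text_element(40, 10, 10, 400, True))     # ratio 0.15
print(prerenal_text_element(40, 10, 10, 400, False))    # ratio 0.15
print(prerenal_text_element(200, 30, 20, 300, False))   # ratio about 0.83
```

A real rule base would presumably check further preconditions (for example, that total protein is increased at all) before emitting either element.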