IJDAR
DOI 10.1007/s10032-009-0082-z
ORIGINAL PAPER
A pseudo-skeletonization algorithm for static handwritten scripts
Emli-Mari Nel · J. A. du Preez · B. M. Herbst
Received: 16 March 2008 / Revised: 16 November 2008 / Accepted: 11 February 2009
© Springer-Verlag 2009
Abstract This paper describes a skeletonization approach
that has desirable characteristics for the analysis of static
handwritten scripts. We concentrate on the situation where
one is interested in recovering the parametric curve that produces the script. Using Delaunay tessellation techniques
where static images are partitioned into sub-shapes, typical skeletonization artifacts are removed, and regions with
a high density of line intersections are identified. An evaluation protocol that measures the efficacy of our approach is
described. Although this approach is particularly useful as a
pre-processing step for algorithms that estimate the pen trajectories of static signatures, it can also be applied to other
static handwriting recognition techniques.
Keywords Skeletonization · Thinning · Pseudo skeleton ·
Document and text processing · Document analysis
E.-M. Nel · J. A. du Preez
Department of Electrical and Electronic Engineering,
University of Stellenbosch, Private Bag X1,
Matieland 7602, South Africa
J. A. du Preez
e-mail: dupreez@dsp.sun.ac.za
B. M. Herbst
Department of Applied Mathematics, University of Stellenbosch,
Private Bag X1, Matieland 7602, South Africa
e-mail: herbst@sun.ac.za
E.-M. Nel (B)
Oxford Metrics Group (OMG), 14 Minns Business Park,
Botley, Oxford OX2 0JB, UK
e-mail: emlimari.nel@gmail.com
1 Introduction
Producing a handwritten signature, or any other handwriting,
is a dynamic process where the pen’s position, pressure, tilt
and angle are functions of time. With digitizing tablets, it is
possible to capture this dynamic information. For some applications, dynamic information is invaluable; during signature verification, for example, it provides significant additional biometric information which may not be readily available to the potential forger. It is therefore not surprising
that efficient and reliable signature verification systems based
on dynamic information exist. There are, however, important
applications where the dynamic information is absent, such
as handwritten signatures on documents, or postal addresses
on envelopes. In these applications, one has access to only
the static images of the handwritten scripts. The absence of
dynamic information in these applications generally causes
a deterioration in performance compared to systems where
the dynamic information is available.
A key pre-processing step in the processing of static scripts
is thinning or skeletonization where the script is reduced to
a collection of lines. A number of algorithms that specialize
in character/word recognition [1–6], or the vectorization of
line drawings [7–10] have been developed. (Note that skeletonization algorithms compute the centerlines (skeletons)
from image boundaries, whereas thinning algorithms remove
outer layers of an image while simultaneously preserving
the image connectivity. In this paper, we describe a pseudo-skeletonization approach where we approximate the centerline of the script.)
Given the advantages of systems based on dynamic
information it is natural to consider the possibility of recovering dynamic information from static scripts. We are particularly interested in extracting the temporal sequence of
the curve (or curves) of the script from its static image,
given different dynamic representatives of the script [11–13].
The pen trajectory of a static script is typically constructed
by assembling the collection of lines that constitute a static skeleton (essentially the curves between intersections).
Most existing techniques rely on some local smoothness criterion to compute the connectivity of lines at intersections;
see, e.g. [14–19]. Finding the necessary connections is generally based on local line directions, and so preserving local
line directions becomes an important attribute of a desirable
skeletonization process.
Local smoothness heuristics that are used to unravel static scripts are typically based on the continuity criterion of
motor-controlled pen motions which assumes that muscular
movements constrain an individual’s hand (holding the pen)
to move continuously, thus constraining the pen to maintain
its direction of traversal [20].
To apply heuristics based on local line directions (or more
generally on the continuity criterion of motor-controlled pen
motions) becomes challenging in areas with a dense concentration of line crossings. This problem is compounded
by skeletonization artifacts; since artifacts are not image features, they introduce more ambiguities. It is therefore not surprising that skeletonization artifacts have a negative impact
on the performance of off-line handwriting recognition systems, as observed by [4,11,15,17,21–23]. In the absence of
a good skeletonization algorithm, it might be appropriate to
follow a different approach by analyzing handwriting directly
from the grey-scale images and therefore circumventing the
skeletonization step [24–26].
The most obvious artifacts for line drawings are peripheral artifacts, such as spurs attached to the skeleton of an
image, and intersection artifacts, where two or more lines
that should intersect fail to cross each other in a single point.
Figure 1a contains a static signature with its skeleton depicted
in Fig. 1b. The skeleton follows the centerline of the original
image, and was obtained through a standard thinning algorithm [27–30]. Note that the skeleton retains the topology of
the original image, where topology refers to the connectivity
of the original image. This skeleton contains many intersection artifacts (some of them are enclosed in dotted circles).
Figure 2a focuses on an intersection artifact (line 2) produced by a typical skeletonization procedure. This artifact
causes a discontinuity in the transition from line 1 to line 3
(indicated by the arrows), i.e., the transition is not smooth.
By removing the intersection artifact (line 2) and connecting
the lines that enter the indicated region in Fig. 2b, a smooth
skeleton can be constructed, as indicated in Fig. 2c. Fixing
intersection artifacts by following local line directions (as
illustrated in Fig. 2) becomes problematic in regions with
multiple intersections. Note, e.g., the ambiguous line directions in the left-hand side of Fig. 1a, which can lead to the loss of the correct connections between incoming and outgoing curves; this can be problematic for applications that recover the temporal information of static handwritten scripts from their unordered skeleton curves.
Fig. 1 a A binarized static signature. b Examples of intersection artifacts (dotted circles) in the skeleton that strictly follows the centerline of a
Fig. 2 a Identifying an artifact (contained in the dashed box) by following the shown line directions (arrows). b Removing the artifact in a and c the final skeleton
One can think of our skeletonization algorithm as a preprocessing step to exploit the continuity criterion of motor-controlled pen motions: we identify regions of multiple crossings where this criterion cannot be applied reliably,
and remove intersection artifacts so that one can apply the
continuity criterion locally at simple intersections. Although
our skeletonization approach is primarily motivated by the
temporal recovery problem, it is expected that other handwriting recognition applications that rely on the continuity
criterion, can also benefit from our skeletonization approach.
Our skeletonization approach modifies the work of Zou
and Yan [7] to better preserve local line directions, and to
identify complicated intersections. In Fig. 3a the first of two
modifications is illustrated: The algorithm is optimized to
retain the topology of the original image, while removing
many of the artifacts. Note, however, that intersection artifacts are not removed in complicated regions (e.g. the left-hand side of the signature) as local line directions (used to identify and fix artifacts) become too unreliable. The resulting skeleton is referred to as a conventional skeleton, as it retains the topology of the original image.
For applications that are concerned with the temporal recovery of static handwritten scripts, there is the danger that the local directions in conventional skeletons are not smooth enough in regions of multiple crossings to trace the curves through these regions. It is therefore also important to provide a second option where the visual appearance is sacrificed for smoother connections. This is achieved by a second modification to the Zou–Yan algorithm which introduces the web-like structures visible in Fig. 3b. Since this skeleton does not preserve the topology of the original image, it is more accurate to refer to it as a pseudo skeleton. Smoother trajectories can improve the chances of a global optimizer (see [11]) finding the correct connections, as shown in Fig. 3c. Note that the indicated trajectories are much smoother than similar trajectories that one can extract from Fig. 3a. Finally, note that these modifications are not computationally expensive; the complexity of our algorithm is similar to that of Zou and Yan.
Fig. 3 a Our final conventional and b pseudo skeletons for Fig. 1a. c Examples of smooth trajectories that can be extracted from b
Another problem arises when one has to evaluate the efficacy of a skeletonization algorithm. A suitable protocol is one that can quantitatively and accurately distinguish between different algorithms. Existing comparisons concentrate mainly on quantitative aspects such as computation time and the ability to preserve the topology and geometric properties of the original image [31,32]. Subjective tests are also used, where a few skeletons are depicted and evaluated according to their visual appearance [7–9,33,34]. Plamondon et al. [35], however, compare different algorithms indirectly through their performance as part of an optical character recognition system. Lam [32] compares the skeletons with a ground truth in order to determine to what extent the topology of the original image is preserved. Our approach uses the same principle. For each static image in our test set, the dynamic equivalent, referred to as the dynamic counterpart of the image, is also recorded and serves as a ground truth for the skeletonization. The details of the comparison between the static skeleton and its dynamic counterpart are nontrivial and are treated in detail in Sect. 3.2. The result of the evaluation procedure is a local correspondence between the static skeleton and its dynamic counterpart. This correspondence allows one to determine local skeleton distortions and leads to a quantitative measure of the efficacy of the skeletonization procedure.
The rest of this paper is organized as follows: Sect. 2 describes our algorithm in detail, and results are presented in Sect. 3. Section 4 draws some conclusions.
2 Our pseudo-skeletonization algorithm
This section describes our algorithm in detail. As mentioned in Sect. 1, our skeletonization scheme is based on the algorithm by Zou and Yan [7], henceforth referred to as the Zou–Yan algorithm. Section 2.1 presents a brief overview of the extensions we made to the Zou–Yan algorithm. Section 2.2 describes how the original images are tessellated and how to use the computed sub-shape information to derive the centerlines of the images. Section 2.3 shows how to remove skeleton artifacts, and Sect. 2.4 summarizes our algorithm.
2.1 A brief overview of the modifications to the Zou–Yan
algorithm
The key idea of the Zou–Yan approach is to tessellate an
image into smaller regions so that regional information can
be exploited to identify artifacts. The tessellation is derived
from the Delaunay triangulation of the image. The Delaunay
triangulation is a triangulation that maximizes the minimum
angle over all the constructed triangles [36]. The Zou–Yan
algorithm first identifies the edges that represent the boundaries of the original image, where edges refer to lines connecting successive boundary points. These boundary points
form the control points for the Delaunay triangulation of the
original image. This triangulation enables one to compute a
skeleton that follows the centerline of the image [7]. Regions
that comprise artifacts can also be identified through sets of
connected triangles and removed.
End regions are defined as regions that contain skeleton
lines connecting endpoints with other endpoints or crosspoints. An endpoint is a skeleton point connected to only
one adjacent skeleton point, whereas a crosspoint is a skeleton point connected to more than two adjacent skeleton
points. Typically, end regions are likely to constitute peripheral artifacts if they are short in length in comparison with
their width. Spurious end regions are simply removed. Intersection regions contain crosspoints. Multiple crosspoints are
joined in a single point by uniting their corresponding intersection regions, thereby removing intersection artifacts. The
term merging is used to describe the process that unites two or
more intersection regions. Typically, the directions of skeleton lines that enter intersection regions are used as basis
for calculating whether nearby intersection regions should
be merged.
A part of a binary signature is shown in Fig. 4a. In our
algorithm we smooth the image boundaries and compute the
corresponding Delaunay triangles from the smoothed boundaries as indicated in Fig. 4b. It is shown in Sect. 2.2 that such
smoothing greatly reduces artifacts (the Zou–Yan approach
does not apply any smoothing). The skeleton (black line) follows the centerline of the boundaries. Figure 4b also depicts
an intersection region (largest triangle in figure) with its corresponding crosspoint indicated by a grey circle. A typical
endpoint is also indicated by the second black circle, where
its corresponding end region consists of the connected set of
triangles from the triangle containing the crosspoint circle to
the triangle containing the endpoint circle.
Fig. 4 a A static image. b The image boundary from a is smoothed and the boundary points are then used as control points to tessellate the signature. The final skeleton is shown (black centerline) with two circles indicating a typical crosspoint (grey) and an endpoint (black)
Fig. 5 Removal of skeleton artifacts. a–c Removing intersection artifacts by uniting the appropriate intersection regions (dashed boxes). The directions (arrows) of the lines that enter the new intersection region are computed to calculate the crosspoint p where the lines should join. d–e Peripheral artifacts are removed by removing spurious end regions (dashed boxes)
Two simple examples are also shown in Fig. 5 to illustrate the basic steps for artifact removal. Figure 5a depicts the
skeleton of an image containing spurious intersection regions
(dashed boxes). Line directions (arrows) are used as a basis for
merging the two intersection regions. Thus, the two regions
in Fig. 5a are merged into a single intersection region, as
shown in Fig. 5b, removing the artifact between them. The
lines that enter this intersection region are interpolated to
cross each other in a single crosspoint p, as shown in Fig. 5c.
Figure 5d shows spurious end regions (dashed boxes) which
are removed to obtain the skeleton in Fig. 5e.
Problems are typically encountered in areas where many
intersection regions are located within close proximity. Such
regions would typically result from the left-hand side of
Fig. 1a. The Zou–Yan algorithm assumes that lines do not
change their orientation after entering an intersection. Due
to the nature of human handwriting, especially signatures,
this is not always the case. When an image becomes indistinct due to multiple crossings in a small region, the lines
that enter the intersection regions are too short to make accurate estimates of local line directions. In such cases it is
not always clear which curves should be connected—if the
skeletonization algorithm follows a dominant curve along
its direction, the wrong curves may be connected, with the
result that important trajectories become irretrievably lost.
It is therefore important to maintain all possible connections, while smoothing lines as much as possible. In short,
one has to avoid merging unrelated intersection regions. This
necessitates further refinements to the basic Zou–Yan algorithm. Accordingly, we impose additional constraints, based
mainly on local line width, introducing the web-like structures mentioned in Sect. 1.
Figure 6 illustrates how our pseudo skeletonization
algorithm deals with complicated regions. Figure 6a shows
a complicated intersection extracted from Fig. 1a. Figure 6b illustrates what can happen if merging is not carefully controlled in such regions (a typical problematic region for the Zou–Yan algorithm). The dashed arrows 1 and 2 indicate line directions that have been enhanced through the merging process. However, close inspection of all the line directions in Fig. 6b reveals that the directions of the straight lines indicated by the arrows 3 and 4 are damaged, complicating the extraction of the correct lines in a stroke recovery algorithm. The difficulties of using heuristics in these regions should be clear. The result from our skeletonization algorithm is shown in Fig. 6c, where merging is controlled by introducing web-like structures to allow more choices of possible line directions. The correct connections can now be extracted by a more comprehensive stroke-recovery algorithm that takes global line directions into account.
Fig. 6 a A binary image extracted from Fig. 1a of an intersection region where multiple lines cross each other. b Losing directional information due to merging, and c introducing web-like structures and less merging to maintain more possible line directions in complicated regions
Since there is a direct relationship between the noise in
the boundaries of an image and the number of artifacts in
the image skeleton, we apply a smoothing procedure to the
original boundaries as well as the final skeletons. The details
of the algorithm are described in the following two sections.
2.2 Shape partitioning
Image boundaries are extracted in the first step to tessellate static handwritten scripts, as illustrated by Fig. 4. The
discrete points that constitute the image boundaries become
the control points of a polygon that represents the static
script. We refer to the boundary polygon as the approximating polygon of the script. Since these boundaries are noisy
(see Fig. 4a), smoothing is necessary:
1. Various techniques exist to smooth parametric curves. First we use the approach in [29], producing a simplified parameterized representation of a line image. Note that we reduce the number of polygon control points in regions where the curvature is low in order to reduce the number of points to process. A low-pass filter is then applied to the resulting boundaries for further smoothing [28]. Excessive smoothing, however, will remove regions of high curvature. This can be problematic for thinner signatures as excessive smoothing allows outer boundaries to cross inner boundaries, as illustrated in Fig. 7. The signature has one outer boundary (solid lines) and a few inner boundaries (dashed lines). In order to ensure that the inner boundaries do not cross the outer boundaries, smoothing is applied iteratively. After each iteration it is verified that no boundaries overlap. Figure 7c depicts the smoothed boundary derived from Fig. 7b. (A minimal smoothing sketch is given after this list.)
2. Image boundaries that enclose three or fewer connected pixels are considered insignificant and removed.
3. For the sake of simplicity, all boundary points are interpolated to fit on a regular image grid.
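As a concrete illustration of the iterative smoothing in item 1, here is a minimal sketch (our own construction, not the authors' code) that applies a circular moving-average low-pass filter to a closed boundary; the check that inner and outer boundaries do not overlap after each iteration is left to the caller:

```python
import numpy as np

def lowpass_boundary(boundary, window=5, iterations=1):
    """Iteratively low-pass filter a closed boundary (an (N, 2) array of
    x, y points) with a circular moving average. Verifying that inner and
    outer boundaries do not cross after each iteration (see item 1 above)
    is assumed to be done by the caller."""
    pts = np.asarray(boundary, dtype=float).copy()
    kernel = np.ones(window) / window
    half = window // 2
    for _ in range(iterations):
        for d in range(2):  # smooth x and y coordinates separately
            wrapped = np.concatenate([pts[-half:, d], pts[:, d], pts[:half, d]])
            pts[:, d] = np.convolve(wrapped, kernel, mode="valid")
    return pts
```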
The effect of boundary smoothing is illustrated in Fig. 7d–f, where (d) depicts a static signature. The skeleton
computed directly from its boundary is shown in (e), and the
skeleton computed from its smoothed boundary in (f). Note
the significant reduction of spurs as a result of a smoother
boundary. The few remaining spurs are simply removed from
the skeleton.
In the next step, we use the boundary points of the static
image as control points for a Delaunay triangulation that can
be used to compute the centerline (skeleton) of the static
image.
The Delaunay triangulation of a set of points P is the straight-line dual of the Voronoi diagram of P. The Voronoi diagram V of P is a uniquely defined decomposition or tessellation of a plane into a set of N polygonal cells, referred to as Voronoi polygons [29]. The polygon cell Vi contains one point pi ∈ P and all points in the plane that are closer to pi than to any other point in P, i.e., for all pj ∈ P with j ≠ i, q ∈ Vi if dist(q, pi) < dist(q, pj). The edges defining a Voronoi polygon are generated from the intersections of perpendicular bisectors of the line segments connecting any one point to all its nearest neighbors in P [29]. A typical Voronoi diagram V of a set of points (black dots) is shown in Fig. 8a. The Delaunay triangulation can now be computed from V by connecting all pairs of points that share the same edge [37], as illustrated in Fig. 8b.
Fig. 7 a A static signature with b its outer boundaries (solid lines) and its inner boundaries (dashed lines). c Smoothing the boundaries in b. d A static signature with e its skeleton computed directly from its boundary. f A skeleton computed from the smoothed boundary of d
Fig. 8 a The Voronoi diagram and b the Delaunay triangulation (straight-line dual of a) of some control points (dots). c Vertices/control points (filled dots), external edges (solid lines) and internal edges (dashed lines) are used to label internal triangles. d The primary skeleton (centerline) of c
In order to proceed we need to recall some concepts from [7,8,34]:
• External triangles occur because the Delaunay triangles are generated inside the convex hull of the object. This can produce triangles outside the approximating polygon. These are simply removed.
• Internal triangles are the Delaunay triangles inside the approximating polygon of a static image. Internal triangles are identified by constructing a ray (half-line) from the centroid of a particular triangle in any direction, so that the ray does not pass through any vertices of the approximating polygon. The ray originates inside an internal triangle if the ray intersects the edges of the approximating polygon an odd number of times [38].
• External edges are the sides of internal triangles that coincide with the image boundaries.
• Internal edges are triangle edges inside the approximating polygon of the image. Note that two adjacent internal triangles have a common internal edge.
• Internal triangles having zero, one, two, or three internal edges are labelled isolated-triangles (I-Ts), end-triangles (E-Ts), normal-triangles (N-Ts) and junction-triangles (J-Ts), respectively. Figure 8c shows examples of each triangle type, where the black dots indicate the boundary points of the original image. The control points/dots form the vertices of the Delaunay triangles. External edges are rendered as solid lines, whereas internal edges are rendered as dashed lines. (A code sketch of this classification follows the list.)
A primary skeleton, coinciding mostly with the centerline
of the original image, is obtained as follows: for N-Ts, the
skeleton connects the midpoints of their internal edges. For
E-Ts and J-Ts, the skeletons are straight lines connecting their
centroids to the midpoints of their internal edges, whereas the
skeletons for I-Ts are their centroids. The skeleton derived from the internal triangles of Fig. 8c is shown in Fig. 8d.
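The per-triangle skeleton rules translate directly into code. A sketch, assuming each triangle is given as a (3, 2) vertex array and its internal edges as vertex-index pairs:

```python
import numpy as np

def primary_skeleton_segments(vertices, internal_edges):
    """Skeleton segments of one internal triangle: N-Ts connect their two
    internal-edge midpoints; E-Ts and J-Ts connect their centroid to each
    internal-edge midpoint; an I-T's skeleton is just its centroid."""
    mids = [0.5 * (vertices[i] + vertices[j]) for i, j in internal_edges]
    centroid = np.mean(vertices, axis=0)
    if len(mids) == 2:                      # N-T
        return [(mids[0], mids[1])]
    if len(mids) in (1, 3):                 # E-T or J-T
        return [(centroid, m) for m in mids]
    return [(centroid, centroid)]           # I-T: degenerates to a point
```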
2.3 Removing artifacts
Parts of handwritten static scripts that are difficult to unravel,
as well as intersection and peripheral artifacts, are identified
by means of a parameter α, which is the ratio between the width w and the length ℓ of a ribbon, i.e., α = w/ℓ, where a ribbon is a set of connected N-Ts between two J-Ts, or between a J-T
and an E-T. A long ribbon is identified when α < α0, whereas a short ribbon is identified when α ≥ α0. Our skeletonization algorithm consists of seven steps. Different thresholds α0 are used during each of these steps, as empirically optimized using the signatures from the Dolfing database [39,40] and the Stellenbosch dataset developed by Coetzer [41]. These signatures form our training set and vary significantly in line thickness. A different test set is used for quantitative measurements; see Sect. 3.3. The width w of a ribbon is taken as the trimean length over all internal edges that constitute the ribbon. The trimean, also known as Tukey’s trimean or best easy systematic (BES) estimate, is a statistically resistant measure of a distribution’s central tendency [42], and is computed as the weighted average of the 25th percentile, twice the 50th percentile and the 75th percentile. The length is the path length of the skeleton associated with the ribbon.
Fig. 9 Removing peripheral artifacts. a An illustration of the parameters involved to determine if the ribbon between the J-T and E-T is spurious. Vertices that are used to compute the width of the ribbon are labelled from a to g, whereas the length of the ribbon is the path length of the skeleton line between h and i. b Result after removing the spurious end region from a. c, d Results after removing spurs from Fig. 7e, f, respectively
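In code, the ribbon statistic reduces to a few lines; a sketch with hypothetical helper names:

```python
import numpy as np

def trimean(values):
    """Tukey's trimean (Q1 + 2*Q2 + Q3) / 4: a statistically resistant
    estimate of a distribution's central tendency."""
    q1, q2, q3 = np.percentile(values, [25, 50, 75])
    return (q1 + 2.0 * q2 + q3) / 4.0

def ribbon_alpha(internal_edge_lengths, skeleton_path_length):
    """alpha = w / l for a ribbon: trimean width over skeleton path length.
    A ribbon is short (a spur/artifact candidate) when alpha >= alpha0."""
    return trimean(internal_edge_lengths) / skeleton_path_length
```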
Figure 9a shows a typical ribbon between an E-T and a J-T.
The length of the ribbon is computed as the path length of
the skeleton line that connects the midpoints of the internal
edges from h to i. The width w of the ribbon is given by the trimean of the edge lengths {‖ab‖, ‖bc‖, ‖cd‖, ‖ce‖, ‖ef‖, ‖fg‖}, where ‖xy‖ = ‖y − x‖ and x, y are both 2D boundary coordinates. The
algorithm proceeds in several steps:
Step 1: Removing spurs. The first step in the skeletonization
is to remove all peripheral artifacts remaining after boundary smoothing. Following [8], short spurs belong to sets of
connected triangles that are short in comparison with their
width; they are removed. Recall that a short ribbon is identified when α ≥ α0; in this step, α0 = 2 is used to identify a short ribbon. After spur removal, the J-T in Fig. 9b becomes an N-T.
The threshold α0 depends on the boundary noise: less
boundary noise results in shorter spur lengths. Thus, α0
increases as the boundary noise decreases. Figure 9c and
d show the result after Step 1 is applied to Fig. 7e and f
with α0 = 2. Note that most of the important image features
from Fig. 7d are preserved in Fig. 9d, whereas it becomes
difficult to calculate α0 for Fig. 9c to simultaneously remove
spurs without removing important image features. Clearly,
boundary smoothing significantly improves spur removal as
spurs are shortened in a natural way, making it easier to compute a robust value for α0 .
Fig. 10 Identifying complicated intersections. a Cluttered J-Ts
extracted from a complicated part of a signature. b Illustration of Step 2,
where J-Ts are numbered, followed by the number of long ribbons
that are connected to them. c Conventional and d pseudo skeletons for
b superimposed on the internal Delaunay triangles
Step 2: Identifying complicated intersections. Figure 10a
indicates the typical locations of J-Ts, as derived from a complicated part in a signature. If many lines cross in a small area,
it is difficult, if not impossible, to maintain the integrity of
lines, i.e., it is difficult to follow individual ribbons through
intersections. The Delaunay triangles enable us to identify
regions where many J-Ts are within close proximity, as shown
in Fig. 10a. Instead of forcing poor decisions in such complicated parts, the primary skeletons are retained in these parts
for conventional skeletons. For pseudo skeletons, web-like
structures are introduced, resulting in additional intersections
to model multiple crossings.
Recall that during the primary skeletonization, the centroids of all J-Ts become skeleton points. As mentioned
above, it is important to avoid forcing decisions in complicated parts of a static script. Hence, for complicated intersections in conventional skeletons, the centroids of J-Ts are
their final skeleton points. For pseudo skeletons, the primary
skeleton points of the J-Ts are removed, and the lines that
enter the J-Ts are directly connected. The resulting web-like
structures contribute to smoother connections than the original primary skeleton points. We proceed to discuss the heuristic measures employed to identify J-Ts that belong to such
complicated regions.
Firstly, J-Ts that are connected to two or three short ribbons are labelled complicated J-Ts. In this case, α0 = 2.5 is
used to identify short ribbons. The primary skeleton points of
complicated J-Ts are replaced by lines connecting the midpoints of the J-T internal edges. The same is done for other
J-Ts that are connected to complicated J-Ts through short
ribbons. This is illustrated in Fig. 10b, where the J-Ts from
Fig. 10a are numbered, followed by the number of long ribbons connected to the J-Ts. Note that J-Ts 1 and 7 are also
labeled as complicated J-Ts although they are both connected
to two long ribbons, as they are connected to complicated
J-Ts through short ribbons. For conventional skeletons the
primary skeleton points of all complicated J-Ts are retained,
as shown in Fig. 10c. For pseudo skeletons the primary skeleton points are replaced with web-like structures, as shown
in Fig. 10d. Note that the skeletonization algorithm differs
only in Step 2 for our conventional and pseudo skeletons.
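A sketch of the Step 2 labelling under an assumed adjacency representation (a J-T id maps to the ids of its incident ribbons; a ribbon is flagged short when α ≥ 2.5); the transitive propagation loop is our reading of the rule above:

```python
def label_complicated_jts(jt_ribbons, short_ribbon):
    """jt_ribbons: dict mapping J-T id -> set of incident ribbon ids.
    short_ribbon: dict mapping ribbon id -> True when alpha >= 2.5.
    A J-T is complicated if two or more of its ribbons are short; the
    label then spreads to J-Ts reached through a short ribbon."""
    complicated = {j for j, rs in jt_ribbons.items()
                   if sum(short_ribbon[r] for r in rs) >= 2}
    changed = True
    while changed:
        changed = False
        for j, rs in jt_ribbons.items():
            if j in complicated:
                continue
            # connected to a complicated J-T through a short ribbon?
            if any(short_ribbon[r] and
                   any(r in jt_ribbons[k] for k in complicated)
                   for r in rs):
                complicated.add(j)
                changed = True
    return complicated
```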
Step 3: Labeling of skeleton points. The remaining simple
J-Ts are either connected to two or three long ribbons. The
skeleton points of such simple J-Ts (recall that the primary
skeleton selected the centroid) are recalculated following a
similar approach to [7] as briefly described below.
Recalculating the skeleton points of simple J-Ts. Recall
from Sect. 2.1 that a crosspoint is a skeleton point that is
connected to more than two adjacent skeleton points and is
associated with an intersection region. Thus, in our case, the
crosspoint pi is the skeleton point associated with J-Ti . At
this point pi is initialized to be the centroid of J-Ti .
The midpoints of internal edges belonging to the first few
triangles (we use 13) in all three ribbons connected to a J-T
are connected, as shown in Fig. 11a. The average directions of
these curves are calculated. Straight lines are then computed
passing through the midpoints of the J-T in the calculated
directions, as illustrated by the dashed lines in Fig. 11b. All
the points where the straight lines intersect are calculated and
averaged. The crosspoint pi is then replaced by this average intersection point p̂i, as indicated by the circle in Fig. 11b.
Recalculation of skeleton points that are out of bounds. In some cases p̂i falls outside J-Ti and all the ribbons connected to J-Ti, i.e., in the image background. To preserve local line directions, p̂i is relocated to an appropriate triangle closest to it. Specifically, for each ribbon j connected to J-Ti, the nearest triangle Tij to p̂i is computed. Thus, Tij can be any triangle that forms part of the jth ribbon connected to J-Ti for j ∈ {1, 2, 3}. For each Tij, the angle θij is computed, where
$\theta_{ij} = \cos^{-1} \dfrac{(\hat{p}_i - p_{ij}) \cdot (\hat{p}_i - p_i)}{\lVert \hat{p}_i - p_{ij} \rVert \, \lVert \hat{p}_i - p_i \rVert}$,  (1)
where pij is the centroid of Tij. The triangle T(ij)min corresponding to the minimum θij is chosen as the triangle that should contain the relocated crosspoint. If T(ij)min is an E-T, the relocated point p̄i (the replacement of p̂i) is the centroid of the E-T. If T(ij)min is an N-T, p̄i is the centroid of the midpoints of the N-T’s two internal edges.
In summary, the skeleton point pi for each simple J-Ti is recalculated as pi = p̂i, or pi = p̄i (if p̂i is out of bounds).
Fig. 11 Calculating crosspoints for simple intersections. a Internal edge midpoints are connected to estimate the local directions of the ribbons that enter J-Ti. b The local ribbon directions (arrows) are extended (dashed lines) to compute the skeleton point p̂i of J-Ti in a
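Equation (1) transcribes directly into code; the clipping guard below is our addition to protect against rounding errors:

```python
import numpy as np

def relocation_angle(p_hat, p_jt_centroid, candidate_centroid):
    """theta_ij of Eq. (1): the angle between the vectors from the
    out-of-bounds crosspoint p_hat to the candidate triangle's centroid
    and to the original J-T centroid. The candidate with minimum angle
    best preserves the local line direction."""
    u = p_hat - candidate_centroid
    v = p_hat - p_jt_centroid
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(c, -1.0, 1.0))  # clip guards rounding errors
```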
Associating a ribbon with each crosspoint. Now that the
skeleton point of J-Ti is known (again denoted by pi ), one
of the three ribbons attached to J-Ti is associated with pi for
future use. The distances between pi and the midpoints of
J-Ti ’s internal edges are calculated. The ribbon with the nearest terminating internal edge j is associated with pi . For
example, the ribbon connected to edge x in Fig. 12a is associated with pi .
Step 4: Removing intersection artifacts (criterion 1). This
step identifies which of the simple J-Ts contribute to intersection artifacts, by adapting some of the criteria used by
[7]. A J-Ti is labelled unstable, i.e., contributing to an artifact, if its skeleton point pi lies outside it. In this case, the
sequence of connected triangles from J-Ti up to the triangle
in which pi falls, are removed, thereby merging them into
a single polygon. Figure 12a illustrates that J-Ti is unstable since its skeleton point pi falls outside it. The intersection region resulting from the removal of all the triangles
up to pi is shown in Fig. 12b. Note that pi is now associated
with a pentagon (five-sided polygon rendered as dotted lines).
The skeleton is now updated by connecting the midpoints of
the two remaining internal edges of the original J-Ti to pi .
The directions of the lines that enter the intersection region
(polygon) are more similar to the directions of the skeleton
lines connecting them to pi than in Fig. 12a. Thus, in this
case, the transitions to crosspoints are smoothed. The ribbon
associated with pi is also shorter than in Fig. 12a as its length
is reduced by the number of triangles that were removed.
Fig. 12 a The skeleton point pi (circle) of J-Ti falls outside it. b All the triangles from J-Ti to pi are removed from a to calculate a new intersection region (dotted polygon) containing the crosspoint pi
Fig. 13 Removing an intersection artifact by extending Step 4. a An intersection artifact (solid line between J-T1 and J-T2) and b removal thereof by uniting J-T1 and J-T2 into a new intersection region (dotted polygon) with skeleton point p
Fig. 14 a Skeleton points p1 and p2 of intersection regions 1 and 2 (numbered dotted polygons) are associated with the same short ribbon. b The short ribbon and intersection regions from a are united into a new intersection region (dotted polygon with skeleton point p), thereby removing the intersection artifact
An extension of Step 4 is illustrated in Fig. 13. The primary
skeletons of J-T1 and J-T2 in Fig. 13a must be joined to
remove the intersection artifact (solid line between J-T1 and
J-T2 ). In this case, the skeleton point p2 of J-T2 falls inside
J-T1 , or similarly, p1 falls inside J-T2 . We therefore unite the
two J-Ts into a four-sided polygon (dotted rectangle) with skeleton point p (circle), as shown in Fig. 13b, where p =
(p1 + p2 )/2.
Note that the primary functions of Step 4 are to improve
y-shaped patterns and to fix simple x-shaped patterns that
contain artifacts. More intersection artifacts are identified
and removed during the next step.
Step 5: Removing intersection artifacts (criterion 2). We
now make use of the information about the location of skeleton points and their associated ribbons obtained in Step 3.
After the application of Step 4, it often happens that the same
short ribbon (α0 = 2) is associated with two crosspoints p1
and p2 , as shown in Fig. 14a. In this case, the two intersection
regions and the ribbon between them are united into a new
intersection region, as shown in Fig. 14b. Note that, after the
application of Step 4, a ribbon that is connected to an intersection region (triangle/polygon) must terminate in either an
intersection region or an E-T. The skeleton point for the new
intersection region (dotted polygon) is p = (p1 + p2 )/2.
Fig. 15 Merging three J-Ts, where a J-T2 must merge with J-T3 and J-T1 according to the locations of the J-T skeleton points p1, p2 and p3 and the criteria imposed by Steps 4 and 5. b A new intersection region (solid lines) with skeleton point p and skeleton lines (thin dashed lines) connecting the midpoints of its internal edges results after merging the J-Ts from a
In addition, three J-Ts must sometimes be merged, as illustrated in Fig. 15. Figure 15a depicts three J-Ts and their skeleton points p1 , p2 and p3 . Conditions for such a merge occur
if, according to Step 4, J-T2 and J-T3 must be united, whereas
according to Step 5, J-T1 and J-T2 must be united. In such
cases, a new intersection region is created with a single skeleton point p = (p1 + p2 + p3 )/3, as shown in Fig. 15b.
Step 6: Removing spurs by reapplying Step 1. The removal of
intersection artifacts may shorten some of the ribbons since
triangles are typically removed to relocate crosspoints. The
occurrence of peripheral artifacts can consequently be re-evaluated. If a crosspoint p is associated with a new short
ribbon (α0 = 2.5) and the short ribbon is connected to an
intersection region and an E-T, the short ribbon is removed.
Step 7: Final skeletons are smoothed using Chaikin’s corner-cutting subdivision method [43,44]. This smoothing scheme
treats the points that constitute a parametric curve as control
points of a polygon and iteratively “cuts” the corners of the
polygon while doubling the numbers of points that constitute the curve. It is shown by Lane and Riesenfeld [44] that
Chaikin’s algorithm converges to a quadratic B-spline curve.
Due to Chaikin’s geometric approach to smooth curves, a
wide variety of shapes can be handled easily and efficiently,
e.g., straight lines and closed curves are treated in the same
manner. Skeletons are represented as sets of curves that are
connected to endpoints and/or crosspoints, as well as closed
curves (e.g., the character “o”, which contains no crosspoints or endpoints). Chaikin’s corner-cutting subdivision method
is then applied to each curve.
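Step 7 is easy to reproduce. A minimal sketch of Chaikin's corner cutting for open and closed curves (the endpoint handling for open curves is our choice):

```python
import numpy as np

def chaikin(points, iterations=2, closed=False):
    """Each pass replaces every edge (p, q) by the points 1/4 and 3/4 of
    the way along it, doubling the point count; repeated passes converge
    to a quadratic B-spline curve (Lane-Riesenfeld)."""
    first, last = np.asarray(points[0], float), np.asarray(points[-1], float)
    pts = np.asarray(points, dtype=float)
    for _ in range(iterations):
        if closed:
            p, q = pts, np.roll(pts, -1, axis=0)
        else:
            p, q = pts[:-1], pts[1:]
        new = np.empty((2 * len(p), 2))
        new[0::2] = 0.75 * p + 0.25 * q   # point 1/4 along each edge
        new[1::2] = 0.25 * p + 0.75 * q   # point 3/4 along each edge
        pts = new if closed else np.vstack([first, new, last])
    return pts
```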
2.4 Algorithm summary
Our skeletonization algorithm is briefly summarized as
follows:
1. The boundaries of static handwritten images are extracted, smoothed and resampled. Small polygons are removed, while the rest of the polygons are subdivided into non-overlapping Delaunay triangles. These triangles are used to calculate the primary skeletons of static images.
2. Peripheral artifacts are removed by removing short ribbons (α0 = 2) connected between J-Ts and E-Ts.
3. Parts of the signature that are difficult to unravel are identified. Here α0 = 2.5. Either the primary skeletons are retained or web-like structures are introduced in these parts, depending on the application.
4. Unstable J-Ts and other intersection regions that contribute to intersection artifacts are identified. Intersection artifacts are corrected using two criteria. The first criterion merges a J-T with a connected set of triangles up to its estimated skeleton point. The second criterion unites two intersection regions if their skeleton points are associated with the same short ribbon (α0 = 2).
5. The last peripheral artifacts are identified after the recalculation of crosspoints in the steps above. Here α0 = 2.5.
6. Skeletons are smoothed using a corner-cutting subdivision scheme.
3 Results
As mentioned in Sect. 1, a specific application of our skeletonization algorithm is to serve as a preprocessing step for extracting dynamic information from static handwritten scripts. To assess the efficacy of our skeletonization algorithm we therefore measure the correspondence between a skeleton and its dynamic counterpart. For testing purposes we collected a signature database, US-SIGBASE [11]. More details are given in Sect. 3.1; see also [13]. These signatures are available at [45].
All signatures in the US-SIGBASE were obtained by signing on paper placed on a digitizing tablet. A paper signature and its dynamic counterpart were therefore recorded simultaneously. The problem is to compare the skeleton, derived from the static image, with its dynamic counterpart. Since the skeleton and its dynamic counterpart are obtained through different processes, at different resolutions, a direct, pointwise comparison is not possible. In addition, the skeleton is derived from a static image that typically contains various artifacts and/or other defects in the image such as broken lines caused by dry ink or too little pressure. A technique to compare a skeleton with its dynamic counterpart is described in Sect. 3.2, allowing us to quantify the performance of our algorithm, as described in Sect. 3.3. Examples of typical skeletons extracted with our algorithm are presented in Sect. 3.4.
3.1 The US-SIGBASE database
The US-SIGBASE database is used to test our algorithm.
This database consists of 51 static signatures and their
dynamic counterparts. All signatures were recorded on paper
placed on a Wacom UD-0608-R digitizing tablet. Each individual was instructed to sign within a 50 mm × 20 mm bounding box using a medium-size ball-point pen compatible with
the digitizing tablet. The paper signatures were scanned as
gray-scale images at 600 dpi and then binarized. To reduce
spurious disconnections (broken lines), the scanner was set
to a very sensitive setting which introduced significant background noise. The noise was reduced by applying a median
filter [29] followed by a low-pass filter [28]. After filtering, the document was binarized with a global threshold set
by the entropy method described in [29]. The line widths of
the resulting static signatures vary between eight and twelve
pixels in parts where the lines do not intersect.
Figure 16a depicts a typical grey-scale grid extracted from
a page in US-SIGBASE. The document’s binarized version,
after filtering, is shown in Fig. 16b. Note that the signatures
in Fig. 16b appear thicker than in Fig. 16a. This is due to the
binarization and has the advantage of reducing the number of
broken lines, but makes dense intersections harder to unravel.
One expects the dynamic counterparts to be perfect skeletonizations of the static images. After the recording procedure
we therefore aligned all the images with their dynamic counterparts, see [46]. The alignment presents a few difficulties.
Firstly, some individuals seem to be unable to sign within
a grid’s constraints, writing not only over the grid lines, but
also over neighboring signatures. In these cases, it is impossible to extract the individual signatures without corrupting them. Secondly, the recording devices introduce several
different types of noise, corrupting the data [47–49]. (We use
a Hewlett–Packard (HP) scanjet 5470C.) Thirdly, digitizing
tablets fail to capture signatures if insufficient pressure is
applied, leading to spurious broken lines. Finally, differences
in orientations between the tablet, the paper on the tablet, and
the paper on the scanner’s surface need to be considered. If
the paper shifts on the tablet while the signature is recorded,
discrepancies between the static signature and dynamic counterpart occur that are difficult to correct afterwards. The alignment was obtained by a brute-force, exhaustive search over various similarity transforms between the static signatures and their dynamic counterparts. If no satisfactory alignment was obtained the signature was discarded. We emphasize that this is for evaluation purposes only, where we need a ground truth in order to obtain a quantitative comparison.
Fig. 16 An example of a typical scanned document in the US-SIGBASE database. a Two static signatures on a grey-scale document and b their binarized representations
3.2 Comparing a skeleton with its dynamic counterpart
To establish a local correspondence between a skeleton
and its dynamic counterpart, a similar approach to Nel et al.
[11–13] is followed. That is, a hidden Markov Model (HMM)
(see [50,51], for example) is constructed from the skeleton.
Matching the dynamic counterpart to the HMM of the static
signature, using the Viterbi algorithm, provides the optimal
pointwise correspondence between the static skeleton and its
dynamic counterpart.
The dynamic counterpart consists of a single parametric curve as well as the pen pressure, used to identify pen-up events. Thus, each coordinate x_t of the dynamic counterpart X = [x_1, x_2, …, x_T] consists of three components: two pen position components (x_t, y_t), written as x_t^{1,2}, and one pen pressure component x_t^3, where x_t^3 = 0 at pen-up events, x_t^3 = 1 otherwise, and t = [1, …, T]. The Viterbi algorithm yields a state sequence s = [s_1, …, s_T] representing the optimal match between the HMM of the static skeleton and its dynamic counterpart X. Following [11–13], each state in the HMM is associated with a unique skeleton point p_i for i ∈ {1, …, N}, where N is the number of skeleton points. Thus, the final state sequence s = [s_1, …, s_T] from the Viterbi algorithm yields the pointwise correspondence between the skeleton points P = [p_{s(1)}, p_{s(2)}, …, p_{s(T)}] and the skeleton’s dynamic counterpart X = [x_1, x_2, …, x_T], where p_{s(t)} is the skeleton point associated with state s(t).
The local correspondence computed by the Viterbi
algorithm can now be used to accurately determine the positions of local distortions in the skeleton, as shown in the next
section.
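The following is a deliberately minimal sketch of such a matching step, not the HMM of [11–13]: states are skeleton points, emissions are squared distances, and transitions are restricted to self-loops and skeleton neighbours; all names and costs are our assumptions.

```python
import numpy as np

def viterbi_match(x, skel_pts, neighbors):
    """x: (T, 2) dynamic pen positions; skel_pts: (N, 2) skeleton points;
    neighbors[i]: indices of skeleton points adjacent to point i.
    Returns the state sequence s(1..T): the matched skeleton point per
    dynamic sample (the pointwise correspondence used above)."""
    T, N = len(x), len(skel_pts)
    cost = np.full((T, N), np.inf)
    back = np.zeros((T, N), dtype=int)
    cost[0] = np.sum((skel_pts - x[0]) ** 2, axis=1)
    for t in range(1, T):
        emit = np.sum((skel_pts - x[t]) ** 2, axis=1)
        for i in range(N):
            cands = [i] + list(neighbors[i])  # stay put or move to a neighbor
            prev = min(cands, key=lambda j: cost[t - 1, j])
            cost[t, i] = cost[t - 1, prev] + emit[i]
            back[t, i] = prev
    s = [int(np.argmin(cost[-1]))]            # backtrace the optimal path
    for t in range(T - 1, 0, -1):
        s.append(int(back[t, s[-1]]))
    return s[::-1]
```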
3.3 Quantitative results
To generate quantitative results, the skeletons of the static
scripts are computed, as described in Sects. 2.2, 2.3. Optimal pointwise correspondences are then established between
the skeletons and their dynamic counterparts using the procedure described in Sect. 3.2. For each corresponding point,
the difference in pen positions is computed as
$e_t = \lVert x_t^{1,2} - p_{s(t)} \rVert$,  (2)
where x_t^{1,2} indicates the first two components (x and y coordinates) of the tth coordinate of the dynamic counterpart x_t, and p_{s(t)} is the corresponding 2D skeleton coordinate as computed by the Viterbi algorithm for t = [1, …, T]. Note that e_t is measured in pixels.
Similarly, the difference in pen direction (normalized velocity) is computed as
$v_t = \left\lVert \dfrac{x_t^{1,2} - x_{t-1}^{1,2}}{\lVert x_t^{1,2} - x_{t-1}^{1,2} \rVert} - \dfrac{p_{s(t)} - p_{s(t-1)}}{\lVert p_{s(t)} - p_{s(t-1)} \rVert} \right\rVert$,  (3)
for all values of t = [2, …, T] where the pressure is nonzero, i.e., $x_t^3 = 1 = x_{t-1}^3$. Note that the maximum distance between two successive coordinates is $\sqrt{2}$ pixels. Thus, $\lVert x_t^{1,2} - x_{t-1}^{1,2} \rVert \in \{1, \sqrt{2}\}$ and $\lVert p_i - p_j \rVert \in \{1, \sqrt{2}\}$, so that $v_t \in [0, 2]$.
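Given the matched sequences, (2) and (3) reduce to a few array operations; the zero-length guard is our addition for repeated samples:

```python
import numpy as np

def alignment_errors(x, p, pressure):
    """x, p: (T, 2) arrays of dynamic positions and matched skeleton
    points; pressure: (T,) with 1 on-paper and 0 at pen-up events.
    Returns e_t (Eq. 2) for all t and v_t (Eq. 3) for pen-down steps."""
    e = np.linalg.norm(x - p, axis=1)                       # Eq. (2)
    dx, dp = np.diff(x, axis=0), np.diff(p, axis=0)
    nx = dx / np.maximum(np.linalg.norm(dx, axis=1, keepdims=True), 1e-12)
    npd = dp / np.maximum(np.linalg.norm(dp, axis=1, keepdims=True), 1e-12)
    v = np.linalg.norm(nx - npd, axis=1)                    # Eq. (3)
    pen_down = (pressure[1:] == 1) & (pressure[:-1] == 1)
    return e, v[pen_down]

# The global measures d_P and d_D below are the medians of e_t and v_t.
```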
To obtain a global comparison, the median dP of all the values for et from (2) is calculated. The value of dP is an indication of how closely the position coordinates of the skeleton and its dynamic counterpart are aligned. Similarly, the median dD of all the values for vt from (3) is calculated. This provides an indication of how well the local line directions of the skeleton and its dynamic counterpart are aligned. To obtain an indication of the number of intersection artifacts in the skeleton, the number of crosspoints nC in the conventional skeleton is computed and expressed as a percentage of the number of crosspoints in the dynamic counterpart. This is then compared with the number of crosspoints that occur in a skeleton obtained from the standard thinning algorithm described in [28]. Although by no means state-of-the-art, it is readily available and similar algorithms are used by many existing temporal recovery techniques. It should therefore provide a reasonable baseline for comparison. In addition, being a thinning as opposed to a skeletonization algorithm, it should produce fewer peripheral artifacts than standard skeletonization algorithms. Final results are computed by taking the averages and standard deviations of dP, dD and nC over the 51 users.
We choose the thinning algorithm described in [28] as our baseline, which is compared with our pseudo skeleton as shown in Table 1. From the values of dP and σ(dP) it is clear that, on average, there is little to choose between the two methods as far as pen position is concerned. Note, however, the difference in the line directions, as measured by dD. Recalling that the maximum value of dD is 2.0, the pseudo skeleton represents a 6% absolute improvement and a relative improvement of 60%.

Table 1 A comparison between a standard thinning algorithm (baseline) and our pseudo skeleton: average position and direction errors (as defined by (2) and (3)) together with their standard deviations σ

          Position (dP)   σ(dP)   Direction (dD)   σ(dD)
Baseline  1.88            0.39    0.20             0.03
Pseudo    1.94            0.42    0.08             0.03

The average number of crosspoints nC in the baseline skeletons is compared to that of our conventional skeletons in Table 2, where nC is expressed as a percentage of the number of crosspoints in the dynamic counterpart. There are considerably fewer crosspoints in our conventional skeletons compared with the baseline skeletons, and the standard deviation is also considerably smaller. This indicates that our conventional skeletons contain significantly fewer intersection artifacts compared to the baseline skeletons.

Table 2 A comparison between a standard thinning algorithm (baseline) and our conventional skeleton: the average number of crosspoints, expressed as a percentage of the actual number (occurring in the dynamic counterpart), and the standard deviation thereof

              Crossings (nC)   σ(nC)
Baseline      481.1%           191.1%
Conventional  158.5%           83.9%

Table 3 Experimental results for different average pen widths (measured in pixels): positional and directional errors (as defined by (2) and (3)), and the number of crosspoints with their standard deviations

          Width   Position (dP)   σ(dP)   Direction (dD)   σ(dD)   Crossings (nC, %)   σ(nC, %)
Baseline  5       0.5             0.04    0.19             0.04    717                 320
Pseudo    5       0.5             0.05    0.02             0.01    139.5               37.6
Baseline  16      1.0             0.6     0.22             0.04    653.3               346.4
Pseudo    16      1.3             0.5     0.09             0.05    150.4               75.8
An experiment to determine the sensitivity to pen width
was conducted with synthetic data. The dynamic counterparts of all 51 users were artificially thickened using a disc-shaped structuring element in a simple dilation operation.
Using such a synthetic set ensures that the line widths of the
signatures are similar for each dilation iteration, while also
preserving the exact dynamic counterparts. The results for
two significantly different pen widths are shown in Table 3.
The average number of crosspoints in our conventional skeletons is closer to that of the dynamic counterparts as compared
with the baseline skeletons for all the experiments. Note that
the standard deviations σ (dP ), σ (dD ) and σ (n C ) increase for
our pseudo skeletons for the thicker signatures. Thicker signatures in general have more complicated regions with the
result that fewer artifacts are removed.
In [11] the same baseline and pseudo-skeletonization algorithms were used for experimental evaluation as in this paper.
In that paper, the skeletonization algorithms were used as
preprocessing steps before unraveling static signatures and
we specifically exploited the continuity criterion of motor-controlled pen motions at relatively simple crossings by
unravelling them prior to introducing a global optimizer to
unravel the rest. Those results confirm the findings in the present paper, i.e., the present algorithm provides smoother skeletons with fewer intersection artifacts compared to the baseline
method. Animations illustrating the local correspondences
between the skeletons and their dynamic counterparts can be
found on our website [45].
3.4 Examples
In this section, we present a number of typical examples.
Fig. 17 Example signatures that were used to determine the optimal value of α, with a the original signatures, b their pseudo skeletons and c their conventional skeletons
The value of α was empirically determined through visual
inspection where we looked for visual attractiveness (smooth
skeletons with few artifacts) while preserving all the possible trajectories. Two datasets were used: the first database
was collected by Dolfing and consists of dynamic signatures,
where static images were obtained by dilating the dynamic
signatures with a 3×1 structuring element of ones. The resulting static images were therefore free from any contamination such as smears on the paper and scanner noise. The
second dataset consists of scanned static signatures captured
with regular pens, as collected by Coetzer [41]. Although the
signatures vary considerably as far as line width and boundary
noise levels are concerned, the same threshold value for α,
as presented in Sect. 2.3, is used in all cases. One might consider computing α more robustly through an automatic optimization algorithm. The algorithm in [13] can, e.g. be used
to determine the accuracy of a pen trajectory that is reconstructed from a static skeleton, given a specific value of α.
Examples of the original signatures and their skeletons
from the test signature set are shown in Fig. 17.
Fig. 18 a A complicated static image to unravel, with b its dynamic counterpart, c its static skeleton and d the skeleton lines that match the dynamic counterpart in b
Signatures 1, 10 and 11 are examples of signatures that
do not have complicated intersections and are free from any
web-like structures in their pseudo skeletons. The pseudo and
conventional skeletons for these signatures are therefore the
same. Recall that the first smoothing stages of our algorithm
reduce artifacts, as shown in Fig. 9. Note, however, how
local line directions are improved in the skeleton of signature 1 after the application of Steps 2–6 as compared with
Fig. 9d. Furthermore, web-like structures preserve all possible connections while smoothing local directions in regions
of multiple intersections (regions that are difficult to unravel).
The pseudo skeletons of signatures 5 and 8 illustrate that
our algorithm is able to identify these difficult parts (evident
from the webs on the left-hand parts), whereas intersection
and peripheral artifacts are corrected in parts that are relatively straightforward to unravel (right-hand parts.) Note
that the conventional skeletons retain the topology of the
original images, whereas the pseudo skeletons have web-like
structures, i.e. more “holes”, but smoother intersections in
complicated regions. Unravelling algorithms can make use
of the smoother paths and the additional local transitions.
Unused web-like structures can be discarded after global
optimization.
The thicker the pen width, the more difficult it becomes
to unravel a static signature. A complicated thick signature
from US-SIGBASE is depicted in Fig. 18a, with its dynamic
counterpart (shown in white) superimposed on it in Fig. 18b.
Note that the dynamic counterpart is well aligned with the
static image, using the technique described in Sect. 3.1. The
pseudo skeleton (with the web-like structures), is shown in
Fig. 18c. Our quantitative evaluation protocol described in
Sect. 3.2 is used to compute a pointwise correspondence
between the dynamic signature in Fig. 18b and the skeleton in Fig. 18c. The resulting skeleton lines that match the
dynamic counterpart according to this protocol are shown in
Fig. 18d. Note that some web-like structures are discarded
compared to Fig. 18c as part of the unravelling process.
4 Conclusions
The main purpose of our modifications to the skeletonization algorithm of Zou, Yan and Rocha [7,34], is to produce
a skeletonization that is close, ideally identical, to the pen
trajectory that produced the handwritten script. Three main
contributions of these modifications are:
1. It is shown that smoothing of image boundaries prior to skeletonization reduces the number of artifacts and enhances local line directions; this simple operation can be applied to any skeletonization algorithm (an illustrative sketch is given at the end of this section). We automatically identify complicated regions with a large number of intersections where heuristics that rely on local
smoothness constraints can become unreliable. In such
regions, artifact removal can be detrimental. For this reason, a conventional skeleton was developed that retains
the topology of the original image, where the topology
refers to the connectivity of the original image. This skeleton is computed from a smooth image boundary where
artifacts are removed only where line directions are reliable (i.e., not in complicated parts of the image). This
skeleton is useful for applications that can benefit from
visually attractive skeletons (smooth skeletons with few
artifacts), such as line drawings of static images, where it
is also important to maintain the topology of the original
image.
2. It is shown how one can modify the conventional skeletons to benefit techniques that recover dynamic information from static handwriting. Specifically, we introduce web-like structures in complicated regions that provide additional local transition options, which would otherwise have been destroyed by conventional skeletonization algorithms. A global temporal recovery algorithm can then use the additional information and discard unused options. The resulting skeletons are referred to as pseudo skeletons, as they do not strictly preserve the topology of the original image.
3. An evaluation protocol is introduced that allows a detailed
comparison of the skeleton with a dynamic ground truth
sequence in the form of a parametric curve. Local distortions of the skeletonization are readily identified.
Detailed tests using the evaluation protocol indicate that our skeletons are, in general, well aligned with the ground-truth parametric curves. In particular, a significant improvement in the local line directions is observed compared to traditional skeletons.
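The sketch referred to in contribution 1 follows. It uses Chaikin's corner-cutting scheme [43], which appears among our references, as one representative boundary smoother; the function name and iteration count are our assumptions, and the smoothing stage described earlier may differ in detail.

```python
import numpy as np

def chaikin_smooth(boundary, iterations=2):
    # boundary: (N, 2) array of vertices of a closed image-boundary polygon.
    pts = np.asarray(boundary, dtype=float)
    for _ in range(iterations):
        nxt = np.roll(pts, -1, axis=0)   # successor of each vertex (closed)
        q = 0.75 * pts + 0.25 * nxt      # cut point 1/4 along each edge
        r = 0.25 * pts + 0.75 * nxt      # cut point 3/4 along each edge
        pts = np.empty((2 * len(q), 2))
        pts[0::2], pts[1::2] = q, r      # corners replaced by two cut points
    return pts  # smoother closed polygon, ready for skeletonization
```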
References
1. Lam, L., Suen, C.Y.: An evaluation of parallel thinning algorithms
for character-recognition. IEEE Trans. Pattern Anal. Mach. Intell.
17(9), 914–919 (1995)
2. Steinherz, T., Intrator, N., Rivlin, E.: A special skeletonization algorithm for cursive
words. In: Proceedings of the Seventh International Workshop on
Frontiers in Handwriting Recognition, International Unipen Foundation, pp. 529–534 (2000)
3. Dawoud, A., Kamel, M.: New approach for the skeletonization of
handwritten characters in gray-scale images. In: Proceedings of the
International Conference on Document Analysis and Recognition,
pp. 1233–1237 (2003)
4. Wen, M., Fan, K., Han, C.: Classification of Chinese characters
using pseudo skeleton features. J. Inf. Sci. Eng. 20, 903–922 (2004)
5. Kegl, B., Krzyzak, A.: Piecewise linear skeletonization using principal curves. IEEE Trans. Pattern Anal. Mach. Intell. 24(1),
59–74 (2002)
6. Ahmed, M., Ward, R.: A rotation invariant rule-based thinning
algorithm for character recognition. IEEE Trans. Pattern Anal.
Mach. Intell. 24(12), 1672–1678 (2002)
7. Zou, J.J., Yan, H.: Skeletonization of ribbon-like shapes based on
regularity and singularity analysis. IEEE Trans. Syst. Man Cybern.
B 31(3), 401–407 (2001)
8. Zou, J.J., Yan, H.: Vectorization of cartoon drawings. In:
Eades, P., Jin, J. (eds.) Selected Papers from Pan-Sydney Workshop
on Visual Information Processing, ACS, Sydney (2001)
9. Tang, Y.Y., You, X.: Skeletonization of ribbon-like shapes based
on a new wavelet function. IEEE Trans. Pattern Anal. Mach.
Intell. 25(9), 1118–1133 (2003)
10. Chiang, J.Y., Tue, S.C., Leu, Y.C.: A new algorithm for line image
vectorization. Pattern Recognit. 31(10), 1541–1549 (1998)
11. Nel, E., Du Preez, J.A., Herbst, B.M.: Estimating the pen trajectories of static signatures using hidden Markov models. IEEE Trans.
Pattern Anal. Mach. Intell. 27, 1733–1746 (2005)
12. Nel, E., Du Preez, J.A., Herbst, B.M.: Estimating the pen trajectories of static scripts using hidden Markov models. In: Proceedings
of the International Conference on Document Analysis and Recognition, pp. 41–47 (2005)
13. Nel, E., Du Preez, J.A., Herbst, B.M.: Verification of dynamic curves extracted from static handwritten scripts. Pattern Recognit. 41(12), 3773–3785 (2008)
14. Pan, J.C., Lee, S.: Offline tracing and representation of signatures.
In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 679–680 (1991)
15. Lee, S., Pan, J.C.: Offline tracing and representation of signatures. IEEE Trans. Syst. Man Cybernet. 22(4), 755–771 (1992)
16. Lallican, P.M., Viard-Gaudin, C.: A Kalman approach for stroke
order recovering from off-line handwriting. In: Proceedings of the
International Conference on Document Analysis and Recognition,
IEEE Computer Society, pp. 519–523 (1997)
17. Chang, H., Yan, H.: Analysis of stroke structures of handwritten Chinese characters. IEEE Trans. Syst. Man Cybernet. B 29(1),
47–61 (1999)
18. Kato, Y., Yasuhara, M.: Recovery of drawing order from single-stroke handwriting images. IEEE Trans. Pattern Anal. Mach.
Intell. 22(9), 938–949 (2000)
19. Guo, J.K., Doermann, D., Rosenfeld, A.: Forgery detection by
local correspondence. Int. J. Pattern Recognit. Artif. Intell. 15(4),
579–641 (2001)
20. Plamondon, R., Maarse, F.J.: An evaluation of motor models of handwriting. IEEE Trans. Syst. Man Cybernet. B 19(5),
1060–1072 (1989)
21. Boccignone, G., Chianese, A., Cordella, L.P., Marcelli, A.: Recovering dynamic information from static handwriting. Pattern Recognit. 26(3), 409–418 (1993)
22. Jäger, S.: A psychomotor method for tracking handwriting. In:
Proceedings of the International Conference on Document Analysis and Recognition, IEEE Computer Society, pp. 528–531
(1997)
23. Govindaraju, V., Srihari, S.: Separating handwritten text from overlapping non-textual contours. In: International Workshop on Frontiers in Handwriting Recognition, pp. 111–119 (1991)
24. Doermann, D., Rosenfeld, A.: Recovery of temporal information
from static images of handwriting. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 162–168 (1992)
25. Doermann, D.S., Rosenfeld, A.: Recovery of temporal information from static images of handwriting. Int. J. Comput. Vis. 15,
143–164 (1995)
26. Doermann, D., Rosenfeld, A.: The interpretation and reconstruction of interfering strokes. In: Frontiers in Handwriting Recognition, pp. 41–50 (1993)
27. Lam, L., Lee, S., Suen, C.Y.: Thinning methodologies-a
comprehensive survey. IEEE Trans. Pattern Anal. Mach.
Intell. 14(9), 869–885 (1992)
28. Gonzalez, R.C., Woods, R.E.: Digital Image Processing. Addison-Wesley, Reading (1992)
29. Seul, M., O’Gorman, L., Sammon, M.S.: Practical Algorithms for
Image Analysis: Description, Examples, and Code. Cambridge
University Press, Cambridge (2000)
30. Gonzalez, R.C., Woods, R.E., Eddins, S.L.: Digital Image Processing using Matlab. Pearson Prentice Hall, New York (2004)
31. Verwer, B., van Vliet, L., Verbeek, P.: Binary and grey-value
skeletons: Metrics and algorithms. Int. J. Pattern Recognit. Artif.
Intell. 7(5), 1287–1308 (1993)
32. Lam, L., Suen, C.Y.: Automatic comparison of skeletons by shape
matching methods. Int. J. Pattern Recognit. Artif. Intell. 7(5),
1271–1286 (1993)
33. Fan, K., Chen, D., Wen, M.: Skeletonization of binary images with
nonuniform width via block decomposition and contour vector
matching. Pattern Recognit. 31(7), 823–838 (1998)
34. Rocha, J.: Perceptually stable regions for arbitrary polygons. IEEE
Trans. Syst. Man Cybernet. B 33(1), 165–171 (2003)
35. Plamondon, R., Suen, C., Bourdeau, M., Barriere, C.: Methodologies for evaluating thinning algorithms for character recognition. Int. J. Pattern Recognit. Artif. Intell. 7, 1247–1270 (1993)
36. De Berg, M., Van Kreveld, M., Overmars, M., Schwarzkopf, O.: Computational Geometry: Algorithms and Applications, 2nd edn. Springer, Berlin (1997)
37. Martinetz, T., Schulten, K.: Topology representing networks. Neural Netw. 7(3), 507–522 (1994)
38. Sedgewick, R.: Algorithms. Addison-Wesley, Reading (1988)
39. Dolfing, J.G.A.: Handwriting recognition and verification: a hidden
Markov approach. Ph.D. thesis, Eindhoven, Netherlands (1998)
40. Van Oosterhout, J.J.G.M., Dolfing, J.G.A., Aarts, E.H.L.: Online signature verification with hidden Markov models. In: Proceedings of the International Conference on Pattern Recognition,
pp. 1309–1312 (1998)
41. Coetzer, J.: Off-line signature verification. Ph.D. thesis, University of Stellenbosch (2005)
42. Tukey, J.W.: Exploratory Data Analysis, pp. 46–47. Addison-Wesley, Reading (1977)
43. Chaikin, G.: An algorithm for high speed curve generation. Comput. Vis. Gr. Image Process. 3, 346–349 (1974)
44. Lane, J.M., Riesenfeld, R.F.: A theoretical development for the
computer generation of piecewise polynomial surfaces. IEEE
Trans. Pattern Anal. Mach. Intell. 2, 34–46 (1980)
45. Nel, E., Du Preez, J.A., Herbst, B.M.: www.ussigbase.org
46. Lallican, P.M., Viard-Gaudin, C., Knerr, S., Binter, P.: The IRESTE ON-OFF (IRONOFF) handwritten image database. In: Proceedings of the International Conference on Document Analysis
and Recognition, pp. 455–458 (1999)
47. Smith, E.: Characterization of image degradation caused by scanning. Pattern Recognit. Lett. 19(13), 1191–1197 (1998)
48. Zhou, J.Y., Lopresti, D., Sarkar, P., Nagy, G.: Spatial Sampling Effects on Scanned 2-D Patterns. World Scientific, Singapore (1997)
49. Smith, E.H.B.: Scanner parameter estimation using bilevel scans
of star charts. In: Proceedings of the International Conference on
Document Analysis and Recognition, pp. 1164–1168 (2001)
50. Deller, J.R., Hansen, J.H.L., Proakis, J.G.: Discrete-Time Processing of Speech Signals. IEEE Press, New York (2000)
51. Rabiner, L.R., Juang, B.H.: An introduction to hidden Markov
models. IEEE ASSP Magazine pp. 4–16 (1986)