In lieu of an abstract, here is a brief excerpt of the content:

COMPLEXITIES OF STRUCTURE: The Shapes Of Groups

However, as we think about the possible structures that data may take up, it is clear that the matter may be far more complex. For instance, the moment that our data are in more than two or three dimensions, the moment, in other words, that more than three variables describe each point in our data, we have difficulty in recognising what the data show. And the moment that the data take up forms in which groups may be of shapes other than spherical or elliptical (hyper-spherical or hyper-elliptical, as the case may be), the moment that groups have interfaces with one another that are more than simple separations, even our statistical methods may be unable to "see" exactly what is present.

Figure 4: Two dimensional artificial data showing (top) two roughly circular groups, (middle) a single dumb-bell shaped group and (bottom) two linear parallel groups (after Duda and Hart, 1973).

BEYOND BIOMETRY

Thus figure 4 shows three graphs in which groups of different shape exist. The upper frame demonstrates the simple situation: two approximately equal round groups; the second frame shows fairly clearly (at least to the eye, since these data are two-dimensional) a single dumbbell-shaped group; the third frame depicts two long groups lying side by side. Figure 5 suggests, however, that our statistical methods may "find" very different structures on occasion. For though the two equal round groups are easily recognised by a standard clustering procedure, the dumbbell shaped group tends to be viewed as two separate round groups. Much the worst result, however, is that the two parallel linear groups are completely unrecognised, the procedure identifying instead two approximately round groups.

Figure 5: The application to the data of figure 4 of a furthest neighbour group finding algorithm.
The upper frame shows that two circular groups are readily found. The middle frame shows that a single dumb-bell shaped group is "discovered" to be two approximately equal circular groups. The bottom frame demonstrates that the two linear parallel groups are "found" to be two quite different unequal approximately circular groups. Other cluster finding methods perform differently (after Duda and Hart, 1973).

A second example is provided by the data in the upper frame of figure 6. Study of these data using the eye (possible, again, because the example is two-dimensional) is equivocal. Do we see a single group here that is arranged in such a way that it has a small very dense centre and a large very diffuse periphery? Or do we see a single group, tight and dense, superimposed upon a diffuse background? Or do we see, yet again, two groups, a small dense group lying upon a large diffuse group? Figure 6 also shows (lower frame) that a particular cluster finding procedure identifies yet a fourth possibility not readily occurring to the human mind: an arrangement of the data suggesting a complex star-like structure. Who is to say which of these views is correct?

Figure 6: Two dimensional artificial data (upper frame) and a furthest neighbour cluster finding procedure applied to them (lower frame; after Duda and Hart, 1973). ...
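
The behaviour described for the two parallel linear groups of figure 5 can be reproduced in miniature. The sketch below is not the procedure or the data behind that figure. It is a minimal pure-Python agglomerative clusterer run on invented toy coordinates, and the names `agglomerate` and `rows_in`, the point spacing, and the nearest neighbour comparison are all assumptions introduced only for illustration.

```python
import math


def agglomerate(points, n_clusters, linkage=max):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    linkage=max scores a pair of clusters by their most distant members
    (furthest neighbour); linkage=min scores them by their closest
    members (nearest neighbour). Ties go to the first pair encountered.
    """
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(
                    math.dist(a, b) for a in clusters[i] for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return clusters


def rows_in(cluster):
    """Return which of the two lines (by y value) a cluster draws from."""
    return sorted({y for _, y in cluster})


# Hypothetical toy data: two parallel lines of eight points each,
# one unit apart along each line and 1.5 units between the lines.
lower = [(x, 0.0) for x in range(8)]
upper = [(x, 1.5) for x in range(8)]
points = lower + upper

furthest = agglomerate(points, 2, linkage=max)
nearest = agglomerate(points, 2, linkage=min)

for name, result in (("furthest neighbour", furthest),
                     ("nearest neighbour", nearest)):
    for cluster in result:
        xs = [x for x, _ in cluster]
        print(name, "| lines used:", rows_in(cluster),
              "| x from", min(xs), "to", max(xs))
```

On these toy points the furthest neighbour run cuts across both lines, returning a left-hand block and a right-hand block that each contain points from the upper and the lower line. The nearest neighbour run returns the two lines themselves. The toy blocks come out equal in size, unlike the unequal groups reported for figure 5, so this shows only the kind of failure, not its exact form.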
