Appendix D Seriation Methods

To identify mound pottery samples with a similar composition of decorated type frequencies and place them into ceramic phases in a statistically valid manner, we conducted cluster analyses of the 21 mound pottery samples analyzed in our frequency seriation of ceramic types (see Fox 1998; Kreisa 1998; Mainfort 1999). Cluster analysis is a set of multivariate statistical techniques used to group objects together based on their similarities. In a cluster analysis, mound provenience units that yield similar frequency percentages of the seven ceramic types are grouped into the same cluster. The resulting clusters generated by this technique show high within-cluster resemblance and low between-cluster similarity (Hair and Black 2000:147). Two basic forms of clustering, hierarchical and nonhierarchical, can be employed.

Cluster Analysis

In using the hierarchical method, the researcher has no predetermined number of clusters in mind to group the data. The hierarchical cluster technique begins by forming single-member clusters in which each site assemblage is uniquely a member of its own cluster. Then, in a step-wise manner, all single-member clusters are combined into larger and larger clusters until a single large cluster encompasses all of the cases. Similarities between clusters are measured in Euclidean distance between plotted points on a graph, and larger clusters are formed by the combination of smaller existing clusters. This technique can be summarized visually as a dendrogram forming a treelike graph with the twigs and branches of the tree forming the smallest clusters with the greatest resemblance to each other, and the larger branches and trunk of the tree forming fewer groups with a greater range of distance between assemblages. Often this technique is used first to identify and quantify clusters in the data set.
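The step-wise merging described above can be illustrated with a minimal Python sketch. It is not the SYSTAT procedure used in this study: the average-linkage rule is an assumption (the text does not name a linkage method), the function names are hypothetical, and the five three-type percentage vectors are invented for illustration rather than taken from the mound samples.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two type-frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, data):
    """Mean pairwise distance between two clusters (assumed linkage rule)."""
    total = sum(euclidean(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(data):
    """Merge single-member clusters step by step until one cluster remains.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is the information a dendrogram displays.
    """
    clusters = [(i,) for i in range(len(data))]  # each case starts alone
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], data)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return history

# Invented percentages of three decorated types in five proveniences.
samples = [
    [70.0, 20.0, 10.0],
    [68.0, 22.0, 10.0],
    [20.0, 50.0, 30.0],
    [22.0, 48.0, 30.0],
    [40.0, 38.0, 22.0],
]

history = agglomerate(samples)
for left, right, dist in history:
    print(left, "+", right, "merged at distance", round(dist, 2))
```

Read from first merge to last, the history corresponds to moving from the twigs of the dendrogram toward its trunk: early merges join the most similar assemblages at small distances, and later merges join broader groups at larger distances.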
The second type of cluster analysis, nonhierarchical or K-means cluster analysis, requires the researcher to specify how many clusters are to be generated. The analysis then proceeds to group the total number of cases into the specified number of clusters based on the goodness of fit determined by ceramic type frequency similarity. Analysis of variance (ANOVA) tests are run for each given cluster solution to determine the strength of the statistical fit for each cluster solution requested. Researchers must use their judgment to determine which cluster solution best fits the data set. To find the best cluster solution, three checks are applied to the statistical cluster results (Hair and Black 2000:180–185). First, a check is made to ensure that cluster sizes are roughly equal, with no small clusters suggestive of outliers that might not be representative of the larger population. Outliers are deleted from the analysis and the analysis is performed again. Once the cluster analysis produces clusters of roughly equal size, a second check is made for the cluster solution with the highest statistical significance of all the variable means compared in the ANOVA statistic. In some cluster solutions, the goodness of fit for some variables is not as strong as in other cluster solutions. The final check is called the stopping rule. One kind of stopping rule involves a comparison of the similarity of measured distance between cases within a cluster, with a stress scale ranging from 0 (indicating a perfect fit) to 1 (indicating the worst fit). When fewer clusters are requested to group the entire data set, more dissimilar data sets are combined to form larger clusters, thus producing higher stress values between cases within a given cluster. When the stress values markedly increase from those values in the preceding higher-number cluster solution, then that higher-number cluster solution before the increase is judged to be the best cluster solution (Hair and Black 2000:184).
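The three checks can be sketched in Python under several stated assumptions. The K-means routine below uses a deterministic farthest-point start, which is a choice made for this illustration. The within-cluster sum of squares stands in for the 0-to-1 stress scale described above and is not on that scale. The ANOVA function returns only the F ratio, not a significance level. All names and the nine invented three-type percentage vectors are hypothetical and do not come from the mound samples.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between two frequency vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(data, k, iterations=100):
    """K-means with an assumed deterministic farthest-point start."""
    centers = [data[0]]
    while len(centers) < k:
        centers.append(max(data, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in data]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(data, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def within_ss(data, labels, centers):
    """Within-cluster sum of squares (stand-in for the stress measure)."""
    return sum(dist2(p, centers[l]) for p, l in zip(data, labels))

def anova_f(data, labels, k, var):
    """One-way ANOVA F ratio for one variable across the k clusters."""
    values = [p[var] for p in data]
    grand = sum(values) / len(values)
    between = within = 0.0
    for j in range(k):
        group = [p[var] for p, l in zip(data, labels) if l == j]
        if not group:
            continue
        mean = sum(group) / len(group)
        between += len(group) * (mean - grand) ** 2
        within += sum((v - mean) ** 2 for v in group)
    if within == 0:
        return math.inf
    return (between / (k - 1)) / (within / (len(data) - k))

# Invented percentages of three decorated types in nine proveniences.
samples = [
    [70, 20, 10], [68, 22, 10], [72, 19, 9],
    [20, 50, 30], [22, 48, 30], [19, 52, 29],
    [45, 15, 40], [43, 17, 40], [46, 14, 40],
]

results = {}
for k in (2, 3, 4):
    labels, centers = kmeans(samples, k)
    sizes = [labels.count(j) for j in range(k)]          # check 1: cluster sizes
    results[k] = (labels, sizes, within_ss(samples, labels, centers))
    print(k, "clusters: sizes", sizes, "within-cluster SS", round(results[k][2], 1))

# Check 2: F ratio of each variable for the three-cluster solution.
f_ratios = [anova_f(samples, results[3][0], 3, v) for v in range(3)]
print("F ratios for k=3:", [round(f, 1) for f in f_ratios])
```

With these invented data, the within-cluster sum of squares rises sharply when moving from three clusters to two, so a stopping rule of the kind described above would retain the three-cluster solution (check 3).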
We employed all of the aforementioned methods of cluster analysis on the 21 stratified mound proveniences in the frequency seriation to further organize the data into statistically valid groupings of similar ceramic assemblages or ceramic phases. In order to avoid unrepresentative sample sizes for each sample mound, only the 21 proveniences with 50 or more decorated potsherds were used in the analysis.

Results of the Hierarchical Cluster Analysis

We used the statistical program SYSTAT (Wilkinson et al. 1992) to generate a dendrogram of cluster patterns (Figure D.1). The dendrogram revealed that the 21 proveniences formed six subcluster groupings based on similarities of ceramic frequency percentages of the seven decorated types. Overall, the dendrogram produced two large clusters with smaller nested subclusters. The first large cluster contains a subcluster of Early...
