Australian Journal of Earth Sciences
An International Geoscience Journal of the Geological Society of Australia
Volume 54, 2007 - Issue 4
Original Articles

Clustering of cumulative grainsize distribution curves for shallow-marine samples with software program CLARA*

Pages 503-519 | Received 12 Aug 2005, Accepted 23 Aug 2006, Published online: 18 Jun 2007
 

Abstract

Clustering of cumulative grainsize distribution curves was trialled with the publicly available software program CLARA as a means of finding sediment samples and geographical areas or parts of the geological record with highly similar particle-size characteristics. CLARA proved effective for this purpose. Tests were made with large datasets from the shallow-marine environments of Sydney Harbour (Australia), Oronsay (Inner Hebrides, Scotland), and Darss Sill (Baltic Sea). CLARA has four possible configurations depending on choices of distance metric and standardisation. One configuration identified outliers and small groups of samples most dissimilar from others, a very useful function. A second configuration clustered cumulative curves in a geometrical fashion similar to manual clustering. Compared to CLARA, an entropy algorithm was several orders of magnitude slower and did not identify outliers. When smaller numbers of clusters were requested, cumulative curves with strongly opposite curvature were grouped by the entropy algorithm, and by CLARA for non-standardised (but not standardised) variables. This potential problem is removed for CLARA by initially forming more clusters than suggested by statistical methods, in conjunction with outlier detection and removal. Entropy and CLARA clustering of frequency distributions provided the best resolution of size modes, and best separation of overlapping size modes, but clustering of cumulative curves provided better overall groupings. However, CLARA is not suitable for direct clustering of all conceivable frequency distributions, a problem not occurring with the corresponding cumulative curves.
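The four configurations mentioned above (a choice of distance metric crossed with a choice of standardisation) can be illustrated with a minimal CLARA-style sketch. It is not the CLARA program itself. The abstract does not name the metrics or the scaling, so the Euclidean/Manhattan pair and the mean-absolute-deviation scaling used here are assumptions, as are the function names (`clara_sketch`, `build_medoids`, `refine_medoids`), the default sample size and the number of draws. Each row of `X` is assumed to be one sample's cumulative percentages at fixed size classes.

```python
import numpy as np

def standardise(X):
    """Assumed scaling: centre each size class, divide by its mean absolute deviation."""
    centred = X - X.mean(axis=0)
    mad = np.abs(centred).mean(axis=0)
    mad[mad == 0] = 1.0  # leave constant columns unscaled
    return centred / mad

def pairwise(A, B, metric):
    """Distances between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    if metric == "manhattan":
        return np.abs(diff).sum(axis=2)
    return np.sqrt((diff ** 2).sum(axis=2))  # euclidean

def build_medoids(D, k):
    """Greedy start: most central point first, then the candidates that cut total cost most."""
    medoids = [int(D.sum(axis=1).argmin())]
    while len(medoids) < k:
        nearest = D[:, medoids].min(axis=1)
        gain = np.maximum(nearest[:, None] - D, 0).sum(axis=0)
        gain[medoids] = -1  # never pick an existing medoid again
        medoids.append(int(gain.argmax()))
    return np.array(medoids)

def refine_medoids(D, medoids, max_iter=50):
    """Alternate assignment and medoid update (a simplified stand-in for a full swap phase)."""
    for _ in range(max_iter):
        labels = D[:, medoids].argmin(axis=1)
        new = medoids.copy()
        for j in range(len(medoids)):
            members = np.flatnonzero(labels == j)
            if members.size:
                within = D[np.ix_(members, members)].sum(axis=1)
                new[j] = members[within.argmin()]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return medoids

def clara_sketch(X, k, metric="euclidean", stand=False,
                 n_draws=5, sample_size=None, seed=0):
    """Cluster random subsamples, then keep the medoids that best fit the whole dataset."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    if stand:
        X = standardise(X)
    n = len(X)
    sample_size = min(n, sample_size or 40 + 2 * k)  # assumed default
    best_cost, best_medoids, best_labels = np.inf, None, None
    for _ in range(n_draws):
        idx = rng.choice(n, size=sample_size, replace=False)
        D = pairwise(X[idx], X[idx], metric)
        medoids = idx[refine_medoids(D, build_medoids(D, k))]
        d = pairwise(X, X[medoids], metric)  # assign every sample, not just the subsample
        cost = d.min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_medoids, best_labels = cost, medoids, d.argmin(axis=1)
    return best_medoids, best_labels
```

Calling `clara_sketch(X, k, metric=m, stand=s)` for each `m` in `("euclidean", "manhattan")` and each `s` in `(False, True)` gives four configurations to compare. Because only subsamples are clustered in full, the cost grows slowly with dataset size, which is consistent with the speed advantage over the entropy algorithm reported above.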

Acknowledgements

I thank Carme Hervada I Sala for providing the Darss Sill dataset. Stuart Anstee, Adrian Baddeley, Brendan Brooke, Andrew Heap and Alan Orpin made suggestions which improved the original manuscript. Alan Orpin informed me of the Australian references to entropy clustering of frequency distributions.

Notes

*Figures 10 – 12 [indicated by an asterisk (*) in the text and listed at the end of the paper] are Supplementary Papers; copies may be obtained from the Geological Society of Australia's website (www.gsa.org.au) or from the National Library of Australia's Pandora archive (http://nla.gov.au/nla.arc-25194).

