Review Article
Corresponding author: Klaus Riede (k.riede@zfmk.de). Academic editor: Diptarup Nandi
© 2018 Klaus Riede.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Riede K (2018) Acoustic profiling of Orthoptera: present state and future needs. Journal of Orthoptera Research 27(2): 203-215. https://doi.org/10.3897/jor.27.23700
Bioacoustic monitoring and classification of animal communication signals have developed into a powerful tool for measuring and monitoring species diversity within complex communities and habitats. The high number of stridulating species among Orthoptera allows their detection and classification in a non-invasive and economical way, particularly in habitats where visual observations are difficult or even impossible, such as tropical rainforests. Major sound archives were queried for Orthoptera songs, with special emphasis on usability as reference training libraries for computer algorithms. Orthoptera songs are highly stereotyped, reliable taxonomic features. However, exploitation of songs for acoustic profiling is limited by the small number of reference recordings: existing song libraries represent only about 1000 species, mainly from Europe and North America, covering less than 10% of extant stridulating Orthoptera species. Available databases are fragmented and lack tools for song annotation and efficient feature-based searching. Results from recent bioacoustic surveys illustrate the potential of the method, but also the challenges and bottlenecks impeding further progress. A major problem is the time-consuming analysis of recordings. Computer-aided identification software exists for classification and identification of cricket and grasshopper songs, but these tools are still far from practical for field application.
A framework for acoustic profiling of Orthoptera should consist of the following components: (1) Protocols for standardized acoustic sampling, at species and community levels, using acoustic data loggers for autonomous long-term recordings; (2) Open access to and efficient management of song data and voucher specimens, involving the Orthoptera Species File (OSF) and Global Biodiversity Information Facility (GBIF); (3) An infrastructure for automatized analysis and song classification; and (4) Complementation and improvement of Orthoptera sound libraries using OSF as the taxonomic backbone and repository for representative song recordings. Taxonomists should be encouraged, or even obliged, to deposit original recordings, particularly if they form part of species descriptions or revisions.
acoustic monitoring, data repositories, Orthoptera Species File, sound libraries, standardization
A considerable number of animal species produce species-specific sounds for communication, indicating their presence acoustically. Among the most impressive examples are tropical rainforest insects, producing a huge variety of audible signals, while only very few can actually be seen (
There is a long tradition in ornithology of identifying birds by their songs (
Due to their small size, high species diversity, strong population fluctuations, and cryptic lifestyles, insects are particularly difficult to monitor, requiring expensive and frequent sampling of specimens (
Besides species discovery, the potential of acoustic monitoring for Environmental Impact Assessments and Red Listing of Orthoptera is evident.
Several bioacoustic monitoring studies focusing on Orthoptera applied (semi-)automatic identification (
Comprehensive song libraries are paramount for acoustic profiling of entire communities, either machine-based or relying on human expertise. At present, there is not even a simple identification tool for unknown Orthoptera songs. The vision is to upload a sound recording to a data warehouse portal and search for similar acoustic patterns, comparable to the Basic Local Alignment Search Tool (BLAST,
This article explores several acoustic archives and their pros and cons as a possible repository for song reference recordings, based on data-mining of existing online sound repositories for Orthoptera songs. By analyzing the lessons learnt, I present a strategic framework for establishing acoustic profiling as a core element of future automatized monitoring schemes, targeting all vocalizing animals within entire soundscapes.
The analysis of insect sounds started with simple verbal descriptions and musical annotation, pioneered by
It is now widely demonstrated that most Orthoptera songs are inborn, stereotyped and species-specific, providing reliable taxonomic features. Most species exhibit a maximum of only three distinct song types: calling, courtship, and rival song, depending on the behavioral context. Striking differences in calling song structure of morphologically similar species helped taxonomists to diagnose and describe “cryptic species”, many of which cannot be determined without a sound recording. In a seminal paper,
Regional faunistic surveys including songs were pioneered by
Original recordings are available for only a small fraction of these pioneer studies, as analog tapes or on CD (see below). In any case, most authors published basic song parameters and graphic representations revealing frequency composition (spectrograms and power spectra) and temporal structure (oscillograms, cf. Fig.
Web-based sound analysis tool for the Macaulay Sound Library, Cornell Lab (https://www.macaulaylibrary.org). Macaulay Library provides more than 400,000 playable audio files (http://macaulaylibrary.org/index.do), and even permits spectrographic online visualization using RavenViewer as a free browser plugin (http://www.birds.cornell.edu/brp/software/sound-analysis-tools). The example shows a recording of a Virtuoso katydid by T. Walker, who provided most of the Orthoptera sound recordings for this sound library. For further details, see text.
An adequate analysis of Orthoptera songs cannot be achieved by the unaided human ear but requires visualization and temporal analysis by signal analysis software. A wide variety of programs is now available for personal computers (for an extensive list see
Particularly for tropical Orthoptera, reliable species identification is only possible by determination of a collected voucher specimen, which often turns out to be an unknown species in need of taxonomic description. Therefore, most tropical Orthoptera are caught and recorded in captivity, to establish a reliable cross-reference between voucher specimen and recording. Besides essential parameters like time, recordist, etc. (cf. Table II in
Older recordings and state of digitization.—The history of analog recordings starts in 1889, and major archives of wildlife recordings go back to the 1940s (for a historical synopsis see
Microphones and recording apparatuses varied widely due to considerable technological changes during the last decades, evolving from analog tape recorders to digital recording. The frequency spectrum of many Orthoptera reaches far into the ultrasound, with the recently described, hitherto highest-pitched katydids of the Neotropical genus Supersonus reaching up to 150 kHz (
Since the 1990s, most monographs compiling Orthoptera songs were backed up by recordings on CD, serving as potential acoustic determination guides and targeting a wider audience. Compilations are available for most European (
Several well-organized sound libraries house hundreds of thousands of catalogued analog tape recordings of vocalizing animals, such as the Tierstimmenarchiv Berlin (http://www.tierstimmenarchiv.de/), British Library Sound Archive's wildlife collection (https://www.bl.uk/collection-guides/wildlife-and-environmental-sounds), or the Macaulay Library of Sounds (
Besides the major sound archives reviewed below, there are important regional archives (reviewed for Latin America by Ranft 2004 in Annex II) and new initiatives such as the sound library of the Museum National d'Histoire Naturelle (La sonothèque: https://sonotheque.mnhn.fr/). A list of links to major sound libraries is provided by the International Bioacoustic Council (
Digital availability of sound recordings.—Digitization of existing analog recordings in most major sound archives is under way, but there are distinct policies on use and access via the World Wide Web (
The following comparison of major sound archives focuses on the number of accessible Orthoptera songs, number of species, and taxonomic compatibility with the Orthoptera Species File (OSF;
Digitized Orthoptera songs in major sound archives and databases. For further details on issues and special features see text.
Archive¹ | N Orthoptera recordings | N taxa | Taxonomic scope | Geographic focus of Orthoptera fauna | Issues and special features |
Macaulay Cornell Lab | 9,282 | 262² | All animals; Ensifera | North America | + Raven viewer for sound visualization; + basket function for download, annotations; (+) GBIF federation with issues; – no voucher cross-reference; – temperature missing or in comment only |
SYSTAX-DORSA | 8,669³ | 550 | Orthoptera | Europe (Ecuador, South East Asia)⁴ | + additional user interfaces via Europeana; (+) GBIF federation with issues; – uploads difficult (completed archive); – temperature in commentary |
Tierstimmenarchiv | 1,093 | 66 | All animals; Orthoptera | World-wide, mainly Europe | + full GBIF federation; – no voucher cross-reference; – temperature missing or hidden in text |
BioAcoustica⁵ | 2,358 | 556 | Orthoptera | World-wide, mainly Europe | + graphic display of standard sound analysis; + rapidly growing, allowing user uploads; (+) GBIF federation with issues |
Borror Sound Archive | 1,761 | 119 | All animals; Orthoptera | North America, Australia | + full GBIF federation |
SINA | n.a. | (440)⁶ | Ensifera | North America | + species fact sheets with sonagrams and songs for download; + full tables of song parameters for download⁷; + cross-reference to voucher; – no database query interface; – GBIF |
Orthoptera Species File | n.a. | 776⁸ | Orthoptera | World-wide | + well-curated, up-to-date taxonomic backbone; (+) providing links to additional resources; – temperature hidden in commentary |
GBIF⁹ | 4,803 | 119 | Orthoptera | World-wide | – double entries of specimens from distinct data providers but identical primary source |
While all databases allow extraction of the number of Orthoptera recordings, information about the number of species was not always available. Therefore, it was queried from a table downloaded from
Major sound libraries focus on vertebrates, particularly birds, containing few insect recordings. In contrast, SINA (
The SYSTAX database. Screenshot of the new SYSTAX user interface, to be released under www.systax.org. A search for the Neotropical tettigoniid genus Anaulacomera recovers several sound recordings from a voucher specimen of a hitherto undescribed species, documented by photographs. Faceting allows searching by images or sounds exclusively.
In summary, accessibility of Orthoptera song recordings in any format is extremely limited. With a total of 26,000 described Orthoptera species of which a (conservatively!) estimated 10,000 are able to stridulate, we have web access to song recordings for about 1000 species, i.e. coverage of a meagre 10% of all stridulating Orthoptera species. Adding another 1000 songs scattered in publications, CDs, books and private collections, we might have song data for about 2000 species, which is still only 20% of all known stridulating species. If we assume that another 20,000 Orthoptera species still remain to be described (again, a conservative estimate, cf.
Accessibility of new digital recordings.—The amount of multimedia data documenting animal songs is growing exponentially thanks to Passive Acoustic Monitoring (PAM) and citizen science efforts (cf.
Traditional targeted song recordings of individual Orthoptera species have now been complemented by acoustic profiling using entire soundscapes (sensu
Soundscape projects generally do not even try to identify or assess species compositions, but rather measure overall indices.
The generated data volume is huge, and in most cases not publicly accessible or, as is the case for microphone arrays, not stored at all. Terabytes of acoustic recordings are stored on researchers’ hard disks, with a high risk of getting lost, thereby impeding the chance for re-analysis. Only a small number of projects maintain servers to release soundscape recordings for re-analysis. Maintenance and release of soundscape data will provide opportunities and future challenges, as well as valuable data sources for orthopterists, because most PAM recordings from rainforests are dominated by insects, and Orthoptera in particular (
Improving data coverage.—The number of species covered by each database presented in Table
For most species with a SYSTAX-DORSA recording, a “typical” song has already been transferred to OSF, which presently contains songs for 818 species and subspecies (
Problems of data sharing and file exchange.—At present, federated bioacoustics datasets downloaded from GBIF have issues resulting from unresolved problems between data providers and GBIF. Macaulay (
In addition, downloading sound files from currently available repositories separates the sound file from its metadata. The safest way to avoid such disintegration is to store metadata within the sound file: typically, a spoken announcement by the recordist contains information about time, place, temperature, microphone, and recording conditions. However, if this announcement is clipped for the sake of signal clarity and detectability, a downloaded sound file can no longer be attributed to its source and metadata. For SYSTAX-DORSA sound files, the Soundminer software (http://store.soundminer.com/) was used to annotate metadata, showing species name and source when displayed on most devices (Fig.
Embedding metadata within sound files. Metadata were embedded within wav and mp3 fields directly from the SYSTAX database using Soundminer software (http://store.soundminer.com/). Metadata are visible within most mp3-players, displaying the species name as “TrackTitle” and the recordist as “Artist” (Courtesy: S. Ingrisch).
Embedding metadata within the sound file creates redundancy which can be used to restore or cross-check the links between the original database storing the metadata and the multimedia object.
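The principle can be illustrated with a minimal sketch, assuming an uncompressed RIFF/WAVE file and the standard RIFF `LIST`/`INFO` chunk with its `INAM` (title) and `IART` (artist) tags. This is not the Soundminer workflow used for SYSTAX-DORSA; the function names `embed_info` and `read_info` are hypothetical, and mp3 files, which carry their tags in a different container (ID3), are not covered.

```python
import struct


def _info_subchunk(tag: bytes, text: str) -> bytes:
    """Build one INFO sub-chunk: 4-byte tag, little-endian size, NUL-terminated text."""
    data = text.encode("utf-8") + b"\x00"
    chunk = tag + struct.pack("<I", len(data)) + data
    if len(data) % 2:  # RIFF chunks are padded to an even number of bytes
        chunk += b"\x00"
    return chunk


def embed_info(path: str, title: str, artist: str) -> None:
    """Append a LIST/INFO chunk to a WAV file and update the RIFF size field."""
    body = b"INFO" + _info_subchunk(b"INAM", title) + _info_subchunk(b"IART", artist)
    list_chunk = b"LIST" + struct.pack("<I", len(body)) + body
    with open(path, "r+b") as fh:
        header = fh.read(12)
        if header[:4] != b"RIFF" or header[8:12] != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        fh.seek(0, 2)              # jump to the end of the file
        if fh.tell() % 2:          # keep chunk alignment
            fh.write(b"\x00")
        fh.write(list_chunk)
        riff_size = fh.tell() - 8  # the RIFF size excludes the first 8 bytes
        fh.seek(4)
        fh.write(struct.pack("<I", riff_size))


def read_info(path: str) -> dict:
    """Return the INFO tags of a WAV file as {tag: text}; empty if none are found."""
    tags = {}
    with open(path, "rb") as fh:
        data = fh.read()
    pos = 12                       # skip "RIFF", size, "WAVE"
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        size = struct.unpack("<I", data[pos + 4:pos + 8])[0]
        if chunk_id == b"LIST" and data[pos + 8:pos + 12] == b"INFO":
            sub = pos + 12
            end = pos + 8 + size
            while sub + 8 <= end:
                tag = data[sub:sub + 4].decode("ascii")
                n = struct.unpack("<I", data[sub + 4:sub + 8])[0]
                tags[tag] = data[sub + 8:sub + 8 + n].rstrip(b"\x00").decode("utf-8")
                sub += 8 + n + (n % 2)
        pos += 8 + size + (size % 2)
    return tags
```

Calling `embed_info("song.wav", "Gryllus campestris", "A. Recordist")` and later `read_info("song.wav")` on a downloaded copy would recover the species name and recordist even if the link to the source database were lost; the file name, species and recordist here are placeholders.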
A combination of features from all databases reviewed here probably describes best the requirements for an ideal Orthoptera song data warehouse. In particular:
· Baseline collection data such as recordist, time, and locality.
· Cross-reference to voucher specimen, if available: repository (e.g. museum collection), unique identifier (collection number), name of the person who identified the specimen, and baseline data.
· If no voucher specimen is available, an image or video, and comments on taxonomic reliability that name the person who made the identification.
· Comprehensive metadata for each recording, in particular temperature, microphone with frequency characteristics and distance from specimen, and preferably sound intensity at a given distance.
· User-friendly upload and query interface for input.
On the output side users need:
· Advanced search functions.
· Basket function for download of selected songs and/or corpora, including metadata.
Optional requirements include online visualization of sound files (spectrogram/oscillogram), generation of bioacoustic factsheets, and flexible tools for annotation of song parameters.
Building on these basic features, a bioacoustics workbench could provide efficient, reciprocal connections to taxonomic (OSF) and federated specimen databases (GBIF).
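One way to make the input requirements listed above concrete is a record type that reports which items are still missing. The following is a sketch only: the class and field names are hypothetical and do not correspond to the schema of any database reviewed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Voucher:
    repository: str          # e.g. a museum collection
    collection_number: str   # unique identifier within the repository
    identified_by: str       # person who determined the specimen


@dataclass
class SongRecording:
    species: str
    recordist: str
    recorded_at: str                      # ISO 8601 date-time string
    locality: str
    temperature_c: Optional[float] = None
    microphone: Optional[str] = None
    mic_distance_m: Optional[float] = None
    voucher: Optional[Voucher] = None
    media_evidence: List[str] = field(default_factory=list)  # image or video links
    identified_by: Optional[str] = None   # used when no voucher exists

    def missing_metadata(self) -> List[str]:
        """List the requirements from the text that this record does not yet meet."""
        gaps = []
        if self.temperature_c is None:
            gaps.append("temperature")
        if self.microphone is None:
            gaps.append("microphone")
        if self.mic_distance_m is None:
            gaps.append("microphone distance")
        if self.voucher is None and not (self.media_evidence and self.identified_by):
            gaps.append("voucher or image/video with named identifier")
        return gaps
```

An upload interface built on such a record could accept incomplete entries but flag them, so that recordings lacking temperature or voucher information remain searchable while their limitations stay visible.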
None of the existing databases fulfils all these requirements. Therefore, the way forward is interoperability and the federation of existing multimedia databases. Commercial or community multimedia providers like the pioneering peer-to-peer filesharing program Napster (https://en.wikipedia.org/wiki/Napster), iTunes, or SoundCloud (https://soundcloud.com/) demonstrate that efficient, user-friendly data management and federation of sound files are feasible, but they are not designed for scientific use, which requires annotation, citability and sustainability of repositories. GBIF federates specimen data. It allows filtering for audio data and provides multimedia links, but offers no interface for direct listening or bulk download via shopping basket functions. However, GBIF is evolving rapidly and is attentive to users’ needs. Among the existing sound libraries, BioAcoustica (http://bio.acousti.ca/,
A scheme illustrating elements and workflows of a bioacoustics data warehouse is presented in Fig.
A data warehouse for sound management. The scheme illustrates elements and workflow for acoustic profiling of Orthoptera. Songs are sampled either by recording individual songsters (Targeted Recordings), or entire acoustic scenes, each of which could contain several Orthoptera songs. Targeted recordings are treated like specimens, with time and locality stamps and, preferably, a voucher specimen. All databases listed in Table
A well-designed data warehouse infrastructure is the only way to organize efficient workflows between taxonomists (providing reference sound libraries) and computer scientists developing algorithmic recognition tools. Ideally, code and documentation of recognizer software should be publicly accessible through the (virtual) data warehouse, together with the sound libraries and references to voucher specimens. For the time being, it is suggested to establish OSF as a taxonomic backbone to host at least one song recording per species, which would allow for verifying completeness of bioacoustic coverage of singing Orthoptera species. Every sound file could be associated with a unique Life Science Identifier (LSID), comparable to Digital Object Identifiers (DOI), facilitating the necessary cross reference between names, multimedia files, voucher specimens, and eventually genetic sequences. However, at present, a functional LSID architecture is jeopardized by lack of standards (cf. Table
Well-documented, comprehensive song libraries are the prerequisite for the next logical step, which is acoustic profiling of entire communities. This is particularly promising for lesser-known tropical faunas, where acoustic recording could accelerate species assessment. Up to now, analyses of Orthoptera communities based on entire soundscapes are still limited to very few sites.
There is a fundamental difference between 1) automatic classification and identification of individual recordings, consisting of high-quality sound signals of an unknown Orthoptera songster, and 2) recognition of Orthoptera songs “hidden” within overall soundscape recordings. The two problems are quite distinct, and the latter requires additional, complex processing steps. Therefore, they are discussed separately in the following sections.
Classification of individual recordings.—For individual recordings, song parameters such as pulse rate and carrier frequency can be easily extracted by basic sound analysis software. These parameters might be sufficient to identify species using a traditional taxonomic key (
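As a rough illustration of how the two parameters named above can be obtained from a clean, single-species recording, the following sketch estimates pulse rate from a frame-wise amplitude envelope and carrier frequency from zero crossings within the sounding frames. It is a deliberately simple stand-in for dedicated sound analysis software; the function names, the 1 ms frame length and the half-maximum threshold are assumptions, and the results are approximate.

```python
import math
from typing import List, Tuple


def frame_rms(samples: List[float], frame_len: int) -> List[float]:
    """Root-mean-square amplitude of consecutive, non-overlapping frames."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def song_parameters(samples: List[float], rate: int,
                    frame_ms: float = 1.0) -> Tuple[float, float]:
    """Return (pulse_rate_hz, carrier_hz) for a single clean recording."""
    frame_len = max(1, int(rate * frame_ms / 1000))
    rms = frame_rms(samples, frame_len)
    threshold = 0.5 * max(rms)               # assumed: half of the loudest frame
    active = [value > threshold for value in rms]

    # Pulse onsets: frames where the envelope rises above the threshold.
    onsets = [i for i in range(1, len(active)) if active[i] and not active[i - 1]]
    if active and active[0]:
        onsets.insert(0, 0)
    if len(onsets) < 2:
        raise ValueError("need at least two pulses to estimate a pulse rate")
    period_s = (onsets[-1] - onsets[0]) * frame_len / rate / (len(onsets) - 1)
    pulse_rate = 1.0 / period_s

    # Carrier: zero crossings counted inside sounding frames only.
    crossings = 0
    active_pairs = 0
    for i, is_active in enumerate(active):
        if not is_active:
            continue
        frame = samples[i * frame_len:(i + 1) * frame_len]
        crossings += sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        active_pairs += len(frame) - 1
    carrier = crossings * rate / (2.0 * active_pairs)
    return pulse_rate, carrier
```

Zero-crossing counts slightly underestimate the carrier when frames at pulse edges are only partly filled, and they fail for broadband songs, for which a power spectrum is the appropriate tool.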
More complex software for Orthoptera song identification is based on Artificial Neural Networks (ANN) and Hidden Markov Models (HMM) which are widely used in automatic human speech recognition (
Data-mining soundscapes.—Identification of individual species in soundscapes is a much harder task because of noise and highly variable microphone distances from songsters. As a first step, Regions of Interest (ROIs) – sound signals probably containing a song – have to be identified and filtered. In a second step, these ROIs can eventually be treated and classified like individual recordings. A considerable number of publications report successful algorithmic identification of sets of bat (
Most recognition software was developed for birds, based on extensive corpora of overall soundscape recordings and high numbers of individual, labelled species recordings used for training and testing.
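The first of the two steps described above, locating Regions of Interest, can be sketched with a simple energy detector: frames whose amplitude clearly exceeds the background level are merged into time spans. This is an illustrative assumption rather than the method of any study cited here; the function name `find_rois`, the 10 ms frame length and the factor of 3 above the median are hypothetical choices, and real soundscapes would normally require band-limited detection in the frequency domain.

```python
import math
import statistics
from typing import List, Tuple


def find_rois(samples: List[float], rate: int, frame_ms: float = 10.0,
              factor: float = 3.0, min_frames: int = 2) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) spans whose frame RMS exceeds factor x the median RMS."""
    frame_len = max(1, int(rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    rms = [
        math.sqrt(sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len)
        for i in range(n_frames)
    ]
    if not rms:
        return []
    # The median serves as noise floor; this assumes songs fill under half of the frames.
    threshold = factor * statistics.median(rms)
    rois = []
    start = None
    for i, value in enumerate(rms + [0.0]):   # the sentinel closes a trailing ROI
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            if i - start >= min_frames:
                rois.append((start * frame_len / rate, i * frame_len / rate))
            start = None
    return rois
```

Each returned span could then be cut out and passed to a classifier as if it were an individual recording, which is the second step outlined in the text.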
To facilitate multiple use of sound files for improving algorithms, the respective sound files should be tagged and labelled as a corpus. A wide variety of well-documented corpora is available to be used in computational linguistics and speech recognition. A speech corpus is a well-defined set of speech audio files (
“It must be clear at this point that those systematists who utilize communicative signals and isolating mechanisms as their principal means of locating and recognizing species are not simply studying biology as well as morphology, or simply using a wide variety of characters, as is commonly and justifiably considered desirable in bio-systematic work. Their entire approach, their methods of analysis, and their interpretations of particular kinds of data are all different. Further, and probably most important, their possibilities for rapid and accurate systematic coverage are unparalleled. For this reason, the groups of animals for which these techniques are possible ought to present unique opportunities for breakthroughs in biogeography and in the study of speciation and other evolutionary phenomena.” (l.c., p. 5). Three decades later, bioacoustic characters of Orthoptera songs frequently form part of species descriptions, taxonomic revisions (e.g. Anatolian Chorthippus species:
To mobilize the full potential of sound repositories for biodiversity research, innovative query tools are needed. The vision is to upload a sound recording to a data warehouse portal and search for similar acoustic patterns, comparable to BLAST (
Thanks to the rapid technological evolution of hardware and software, complex Artificial Intelligence tools for recognition of human speech, music and animal sounds are now available for personal devices such as smartphones. Commercial programs and apps such as Shazam (for music recognition: https://play.google.com/store/apps/details?id=com.shazam.android&hl=en_US) or Alexa (for human speech recognition: https://play.google.com/store/apps/details?id=com.amazon.dee.app&hl=de) are well-known examples. Evidently, speech recognition is of considerable military and economic interest, which means that large parts of on-going research are not accessible to the research community. This might be the reason that animal sound recognition lags far behind the performance of the above-mentioned commercial products.
PAM in combination with computer-aided algorithms could lead to major progress in species monitoring and discovery.
While there has been considerable progress in bird song recognition and the labelling of large audio datasets, a comparable milestone has not yet been reached for Orthoptera. This is probably due to insufficient coverage in sound libraries. Orthoptera songs have been documented for less than 20% of described species. Reference sound libraries are missing not only for tropical regions, but are incomplete even for well-known faunas, e.g. North American grasshoppers, despite comprehensive literature including detailed description of communication and spectrograms of songs (
It is doubtful that self-organizing scientific routine procedures will suffice to establish the necessary infrastructure sketched here. A strong commitment to data sharing as part of good scientific practice is needed, preferably under the leadership of the respective scientific societies such as IBAC or the Orthopterists’ Society, together with representatives from major sound archives. The OSF (
Despite promising first results, an efficient connection and data flow between sound archives, museum collections, advanced computational tools and users has not yet been established. Close cooperation of biologists with computer engineers is needed to cope with the data deluge generated by PAM. Again, well-curated and documented song libraries are a prerequisite to exploit bioacoustic Big Data for further biodiversity assessments.
Basically, an efficient acoustic sampling strategy should consist of the following components:
1. Protocols for standardized acoustic recording, at species and community level, using acoustic data loggers for autonomous long-term recordings.
2. Open access to and efficient management of sound recordings, song data, and voucher specimens, involving the Orthoptera Species File (OSF:
3. An infrastructure for automatic analysis and song classification for on-the-ground and web-based analysis, including Web 2.0 applications for user communities and citizen science.
4. A strategic framework for future inventorying and monitoring efforts, including geographic priorities.
Components 1 and 3 involve the entire terrestrial bioacoustics research community, requiring considerable effort to overcome fragmentation between distinct bioacoustic subgroups, clustering around distinct taxa (e.g. frogs, birds, etc.). In contrast, components 2 and 4 focus on Orthoptera and are feasible, and could eventually serve as a model for other species groups.
A recent commentary paper by
This article is based on numerous fruitful discussions with my colleagues, in particular, Maria Marta Cigliano, Holger Braun and Hernán Lucas Pereira, Museo de La Plata, La Plata, Argentina. Thanks to Ed Baker for linguistic corrections and very useful suggestions. Special thanks to reviewer Rittik Deb, who made very constructive suggestions for re-structuring the article and deepening the sections on algorithmic species recognition.