Acoustic profiling of Orthoptera: present state and future needs

Bioacoustic monitoring and classification of animal communication signals has developed into a powerful tool for measuring and monitoring species diversity within complex communities and habitats. The high number of stridulating species among Orthoptera allows their detection and classification in a non-invasive and economic way, particularly in habitats where visual observations are difficult or even impossible, such as tropical rainforests. Major sound archives were queried for Orthoptera songs, with special emphasis on usability as reference training libraries for computer algorithms. Orthoptera songs are highly stereotyped, reliable taxonomic features. However, exploitation of songs for acoustic profiling is limited by the small number of reference recordings: existing song libraries represent only about 1000 species, mainly from Europe and North America, covering less than 10% of extant stridulating Orthoptera species. Available databases are fragmented and lack tools for song annotation and efficient feature-based searching. Results from recent bioacoustic surveys illustrate the potential of the method, but also the challenges and bottlenecks impeding further progress. A major problem is time-consuming data analysis of recordings. Computer-aided identification software exists for classification and identification of cricket and grasshopper songs, but these tools are still far from practical for field application. 
A framework for acoustic profiling of Orthoptera should consist of the following components: (1) Protocols for standardized acoustic sampling, at species and community levels, using acoustic data loggers for autonomous long-term recordings; (2) Open access to and efficient management of song data and voucher specimens, involving the Orthoptera Species File (OSF) and Global Biodiversity Information Facility (GBIF); (3) An infrastructure for automatized analysis and song classification; and (4) Complementation and improvement of Orthoptera sound libraries using OSF as the taxonomic backbone and repository for representative song recordings. Taxonomists should be encouraged, or even obliged, to deposit original recordings, particularly if they form part of species descriptions or revisions.


Introduction
A considerable number of animal species produce species-specific sounds for communication, indicating their presence acoustically. Among the most impressive examples are tropical rainforest insects, producing a huge variety of audible signals, while only very few can actually be seen (Riede 1993).
There is a long tradition in ornithology of identifying birds by their songs (Parker 1991). Acoustic assessment forms part of regular censusing (reviewed by Brandes 2008), or targeted searches for flagship species such as the Ivory-billed Woodpecker (Swiston and Mennill 2009). Efficiency and reproducibility of human observers can be increased considerably by using powerful directional microphones in combination with cheap portable sound recording devices and bat detectors, allowing monitoring of high frequency or even ultrasound signals (reviewed by Obrist et al. 2010, p. 79). Several research groups developed sophisticated autonomous sound recording and automated classification techniques, facilitating monitoring and inventorying of birds (Haselmayer and Quinn 2000, Celis-Murillo et al. 2009; but see Hutto and Stutzman 2009, for a discussion of limitations), whales (Širović et al. 2009), bats (Jennings et al. 2008), frogs (Hu et al. 2009), crickets (Riede 1993, Nischk and Riede 2001, Riede et al. 2006), bushcrickets (Penone et al. 2013) and grasshoppers (Chesmore and Ohya 2004, Gardiner et al. 2005).
Due to their small size, high species diversity, strong population fluctuations, and cryptic lifestyles, insects are particularly difficult to monitor, requiring expensive and frequent sampling of specimens (Gardner et al. 2008). The species-specific songs of Orthoptera make them detectable by acoustic monitoring. With the help of adequate equipment, recordings can be used for discovery of hitherto undescribed, "new" species, detection of endemics, non-invasive mapping of species abundances and ranges (Penone et al. 2013), and rapid assessment of community structure and species turnover (Forrest 1988), particularly in complex habitats with low visibility (Riede 1993, Diwakar et al. 2007, Schmidt et al. 2013). At present, information on phenology, activity patterns, abundance, and community structure is only available for a very small number of insects, but is urgently needed to document potentially dramatic effects of climate change and changing land use patterns on insect communities (Garnas 2018, Maurer et al. 2018). The high number of stridulating species among the Orthoptera is both an opportunity and a challenge for compiling these highly needed datasets by acoustic profiling.
Besides species discovery, the potential of acoustic monitoring for Environmental Impact Assessments and Red Listing of Orthoptera is evident. Cordero et al. (2009) recognized and mapped the rare and endangered silver-bell cricket Oecanthus dulcisonans Gorochov, 1993 by its song. On the island of Réunion, several endemic crickets are indicator species for native forest, and acoustic monitoring was applied successfully to survey a reforestation program (Hugel 2012). The strong high-frequency components of bushcricket songs allow separation from ambient noise by highpass filtering. Exploiting these strong ultrasound components, Penone et al. (2013) were able to map singing specimens along roadsides in France, using ultrasound bat recorders.
Several bioacoustic monitoring studies focusing on Orthoptera applied (semi-)automatic identification (Fischer et al. 1997, Gardiner et al. 2005), illustrating the potential of the method. However, there are severe challenges impeding further progress. Among these is the lack of baseline data (Lehmann et al. 2014) for the respective region where acoustic monitoring is planned. Lists of candidate species are missing for most regions of the world, even for the comparatively well-known European fauna. Another bottleneck is the lack of well-curated song reference libraries, which will be the main topic of this paper.
Comprehensive song libraries are paramount for acoustic profiling of entire communities, either machine-based or relying on human expertise. At present, there is not even a simple identification tool for unknown Orthoptera songs. The vision is to upload a sound recording to a data warehouse portal and search for similar acoustic patterns, comparable to the Basic Local Alignment Search Tool (BLAST, Altschul et al. 1990), available as a tool in genetic databases (e.g. National Center for Biotechnology Information (NCBI), http://blast.ncbi.nlm.nih.gov/Blast.cgi). However, this requires comprehensive databases. Upload of sequence data to GenBank is a pre-requisite for publication in peer-reviewed journals (see editorial policies for data sharing and submission guidelines of major journals, e.g. https://journals.plos.org/plosgenetics/s/submission-guidelines#loc-accession-numbers). As a result, we now have comprehensive repositories for gene sequences. As will be shown below, Orthoptera song libraries are far from comprehensive. An editorial policy of obligatory submission of original sound files to selected sound libraries would rapidly improve coverage of existing sound repositories, which is a necessary condition for progress of computer-aided species identification.
This article explores several acoustic archives and their pros and cons as a possible repository for song reference recordings, based on data-mining of existing online sound repositories for Orthoptera songs. By analyzing the lessons learnt, I present a strategic framework for establishing acoustic profiling as a core element of future automatized monitoring schemes, targeting all vocalizing animals within entire soundscapes.

Present knowledge of Orthoptera songs and coverage in sound repositories
The analysis of insect sounds started with simple verbal descriptions and musical annotation, pioneered by Scudder (1868) for North American and Yersin (1854) for European grasshoppers (reviewed by Ragge and Reynolds 1998: p. 64). Faber (1953) focused on their function for intraspecific communication, with elaborate verbal transcriptions of songs and entire behavioral sequences, including optical displays. Research about female attraction (phonotaxis) elicited by these stereotyped songs has a long history, reviewed by Weber and Thorson (1989). Some crickets and several gomphocerine grasshopper species were used as model organisms for sophisticated neuroethological and biological experiments to unravel underlying neural circuitry (for Gryllus bimaculatus, G. campestris: Weber and Thorson 1989, Schöneich et al. 2015; for Acrididae: Roemer and Marquart 1984, Helversen and Helversen 1998, Ronacher and Stumpner 1988).
It is now widely demonstrated that most Orthoptera songs are inborn, stereotyped and species-specific, providing reliable taxonomic features. Most species exhibit a maximum of only three distinct song types: calling, courtship, and rival song, depending on the behavioral context. Striking differences in calling song structure of morphologically similar species helped taxonomists to diagnose and describe "cryptic species", many of which cannot be determined without a sound recording. In a seminal paper, Walker (1964) reviewed studies on songs and taxonomy of North American Orthoptera, searching for possible cryptic species. He concluded that "approximately one-fourth of the species of gryllids and tettigoniids of the eastern United States had never been recognized or had been wrongly synonymized." (l.c., p. 346). His discovery and description of "virtuoso katydids" (uhleri group of the genus Amblycorypha: Walker 2004a) corroborated this prediction.
Regional faunistic surveys including songs were pioneered by Pierce (1948) and Alexander (1956) for North American crickets, Otte and Alexander (1983) for Australian crickets, and Heller (1988) for European Tettigonioidea. Each of these studies provided graphic representations and comparative analysis of acoustic signatures for hundreds of species, highlighting pronounced interspecific differences in frequency composition and temporal structure.
Original recordings are available for only a small fraction of these pioneer studies, as analog tapes or on CD (see below). In any case, most authors published basic song parameters and graphic representations revealing frequency composition (spectrograms and power spectra) and temporal structure (oscillograms, cf. Fig. 1). These parameters could eventually be used as preliminary proxies in annotated repositories, and later be supplemented by song recordings.
An adequate analysis of Orthoptera songs cannot be achieved by the unaided human ear but requires visualization and temporal analysis by signal analysis software. A wide variety of programs is now available for personal computers (for an extensive list see Obrist et al. 2010), including the RavenViewer plug-in for the Firefox web-browser, allowing online analysis (Fig. 1).
Particularly for tropical Orthoptera, reliable species identification is only possible by determination of a collected voucher specimen, which often turns out to be an unknown species in need of taxonomic description. Therefore, most tropical Orthoptera are caught and recorded in captivity, to establish a reliable cross-reference between voucher specimen and recording. Besides essential parameters like time, recordist, etc. (cf. Table II in Ranft 2004), temperature must always be annotated because temporal patterns of Orthoptera songs depend considerably on temperature ("Dolbear's law": Dolbear 1897, Frings and Frings 1962).
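As an illustration of this temperature dependence, the relation commonly quoted as Dolbear's law gives air temperature in degrees Fahrenheit as T = 50 + (N - 40)/4, where N is the number of chirps per minute. The following minimal Python sketch applies that relation and its inverse. The function names are hypothetical, and the coefficients are species-specific: they serve only as an example and should not be read as valid across Orthoptera.

```python
def dolbear_temperature_f(chirps_per_minute: float) -> float:
    """Air temperature (deg F) from chirp rate: T = 50 + (N - 40) / 4."""
    return 50.0 + (chirps_per_minute - 40.0) / 4.0


def expected_chirps_per_minute(temp_f: float) -> float:
    """Inverse relation: N = 40 + 4 * (T - 50)."""
    return 40.0 + 4.0 * (temp_f - 50.0)


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Example: the same song counted at 120 chirps per minute
temp_f = dolbear_temperature_f(120.0)
temp_c = fahrenheit_to_celsius(temp_f)
```

A recording annotated with temperature can thus be compared with reference songs taken at other temperatures, provided a species-specific relation of this kind has been established.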
Older recordings and state of digitization.-The history of analog recordings starts in 1889, and major archives of wildlife recordings go back to the 1940s (for a historical synopsis see Ranft 2004). Targeted recording of individual specimens with directional microphones and portable (albeit heavy) tape recorders was the standard methodology during the 20th century, resulting in impressive analog tape archives which often remained with the researcher. There is a high risk of loss of these valuable collections due to deterioration and misplacement (Marques et al. 2014).
Journal of Orthoptera Research 2018, 27(2)
Microphones and recording apparatuses varied widely due to considerable technological changes during the last decades, evolving from analog tape recorders to digital recording. The frequency spectrum of many Orthoptera reaches far into the ultrasound, with the recently described, hitherto highest-pitched katydids of the Neotropical genus Supersonus reaching up to 150 kHz (Sarria-S et al. 2014). During the 20th century, analog recording of ultrasound song components required special microphones and expensive high-speed tape recorders (see materials and methods in Morris 1980, Morris and Beier 1982, and Morris et al. 2018). Today, common digital recorders with built-in microphones and 96 kHz sampling rate cover a frequency range up to 30 kHz with sufficient quality. In addition, there is an increasing number of ultrasound recording devices and "bat detectors", reaching far into the ultrasound up to 300 kHz (see Obrist et al. 2010, p. 79), facilitating classification of tettigoniid songs in the field.
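The sampling rates mentioned above relate to recordable bandwidth through the Nyquist criterion: a digital recorder can represent frequencies only up to half its sampling rate. The sketch below, with hypothetical helper names, works through the figures given in the text. That the usable range of a 96 kHz recorder ends near 30 kHz and not at the theoretical 48 kHz is presumably a limitation of built-in microphones; this is an assumption and is not stated in the source.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2


def min_sample_rate_hz(max_signal_hz: float) -> float:
    """Lowest sampling rate that still captures a given signal frequency."""
    return 2 * max_signal_hz


def alias_frequency_hz(signal_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling without an anti-alias filter."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


recorder_limit = nyquist_limit_hz(96_000)             # common digital recorder
supersonus_rate = min_sample_rate_hz(150_000)         # song components at 150 kHz
supersonus_alias = alias_frequency_hz(150_000, 96_000)
```

Under these assumptions, a 150 kHz song component recorded at 96 kHz would not simply be missing: without filtering it would be folded down to a misleading lower frequency.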
Since the 1990s, most monographs compiling Orthoptera songs were backed up by recordings on CD, serving as potential acoustic determination guides and targeting a wider audience. Compilations are available for most European (Ragge and Reynolds 1998), Italian (Fontana et al. 2002), Central European (Bellmann 1993), Australian Orthoptera (Rentz 1996), and Costa Rican katydids (Naskrecki 2000). A comprehensive compilation of Japanese Orthoptera songs on two CDs forms part of an illustrated guide to Orthoptera (Murai 2015). Note that a CD is already a digitized recording, usually of high quality. Due to copyright rules, most of these recordings are not publicly available. Nevertheless, they usually can be used for research purposes, analysis and feature extraction.
Several well-organized sound libraries house hundreds of thousands of catalogued analog tape recordings of vocalizing animals, such as the Tierstimmenarchiv Berlin (http://www.tierstimmenarchiv.de/), British Library Sound Archive's wildlife collection (https://www.bl.uk/collection-guides/wildlife-and-environmental-sounds), or the Macaulay Library of Sounds (Cornell Lab (2017) http://macaulaylibrary.org/). The latter provides more than 402,720 playable audio files, and even permits spectrographic online visualization using RavenViewer as a free browser plugin (cf. Fig. 1). With more than 40,000 animal sound recordings, the Borror Laboratory of Bioacoustics archive (http://blb.osu.edu/database/; Ohio State University) is among the smaller archives, but contains important historic Orthoptera recordings by R. Alexander and D. Borror, including the few available recordings of North American grasshoppers (Acrididae).
Besides the major sound archives reviewed below, there are important regional archives (reviewed for Latin America by Ranft 2004 in Annex II) and new initiatives such as the sound library of the Museum National d'Histoire Naturelle (La sonothèque: https://sonotheque.mnhn.fr/). A list of links to major sound libraries is provided by the International Bioacoustic Council (IBAC 2018).
Fig. 1. Web-based sound analysis tool for the Macaulay Sound Library, Cornell Lab (https://www.macaulaylibrary.org). Macaulay Library provides more than 400,000 playable audio files (http://macaulaylibrary.org/index.do), and even permits spectrographic online visualization using RavenViewer as a free browser plugin (http://www.birds.cornell.edu/brp/software/sound-analysis-tools). The example shows a recording of a Virtuoso katydid by T. Walker, who provided most of the Orthoptera sound recordings for this sound library. For further details, see text.
Digital availability of sound recordings.-Digitization of existing analog recordings in most major sound archives is under way, but there are distinct policies on use and access via the World Wide Web (Baker et al. 2015).At present, most major sound archives provide searchable catalogues of all audio, offering public access to and download options for digitized recordings, under varying license agreements.In some cases, scientific re-use is limited because sound files are made available in compressed formats such as mp3 (ISO/IEC 11172-3:1993).
The following comparison of major sound archives focuses on the number of accessible Orthoptera songs, number of species, and taxonomic compatibility with the Orthoptera Species File (OSF; Cigliano et al. 2018), as well as user-friendliness of web interfaces. Connectivity with the Global Biodiversity Information Facility (GBIF: http://www.gbif.org) was analyzed by a GBIF query for "Orthoptera", adding "audio" multimedia type as additional filter criterion. The number of Orthoptera recordings and species for these major sound archives is summarized in Table 1, including comments on accessibility and particular issues. Archives differ considerably in taxonomic and geographic coverage. Most archives have several recordings for each species, and each archive has strengths and weaknesses summarized in the last column.
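The query described above was run through the GBIF portal. A roughly equivalent request against GBIF's public occurrence search API might be assembled as in the following sketch, which only builds the request URL and does not contact the server. The helper name is hypothetical, and the taxon key used for Orthoptera is an assumption that should be checked against the GBIF backbone taxonomy before use.

```python
from urllib.parse import urlencode

GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"

# Assumed GBIF backbone key for the order Orthoptera; verify before use.
ORTHOPTERA_TAXON_KEY = 1458


def build_audio_query(taxon_key: int, limit: int = 0) -> str:
    """Build a search URL for occurrences of a taxon that carry sound media.

    With limit=0 the response holds no records, only the total count.
    """
    params = {"taxonKey": taxon_key, "mediaType": "Sound", "limit": limit}
    return GBIF_OCCURRENCE_SEARCH + "?" + urlencode(params)


url = build_audio_query(ORTHOPTERA_TAXON_KEY)
```

Because GBIF accesses data providers dynamically, counts returned by such a request will differ from the figures reported here.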
While all databases allow extraction of the number of Orthoptera recordings, information about the number of species was not always available. Therefore, it was queried from a table downloaded from GBIF (2015). A close inspection reveals three major contributors: Borror Lab, Animal Sound Archive (= TSA), and ZFMK DORSA. Note that GBIF accesses data providers dynamically and the number of records is increasing daily. While the GBIF (2015) dataset contained 3973 occurrences, a more recent Orthoptera/Audio search (GBIF 2017) resulted in 4803 occurrences from 119 species.
Major sound libraries focus on vertebrates, particularly birds, containing few insect recordings. In contrast, SINA (Walker 2004b), OSF (Cigliano et al. 2018) and SYSTAX (SysTax 2017) focus exclusively on Orthoptera. The SYSTAX-DORSA (2017) virtual museum is a repository dedicated to Orthoptera types, song recordings, pictures, and voucher specimens from German institutions and private collections. This database includes 2229 type specimens documented by approximately 25,000 images (Fig. 2). As part of a major digitization initiative funded by the German Research ministry, analog tapes from widely scattered institutional and private sound archives have been digitized (Ingrisch et al. 2004) and made accessible at http://www.systax.org and via GBIF (2017). The digitization of historic analog tapes of ultrasound recordings was particularly challenging, because the appropriate tape recorders for their reproduction are becoming rare.
In summary, accessibility of Orthoptera song recordings in any format is extremely limited. With a total of 26,000 described Orthoptera species of which a (conservatively!) estimated 10,000 are able to stridulate, we have web access to song recordings for about 1000 species, i.e. coverage of a meagre 10% of all stridulating Orthoptera species. Adding another 1000 songs scattered in publications, CDs, books and private collections, we might have song
In a letter to Science, Toledo et al. (2015) suggested that scientific journals require deposition of sound files used in publications. Submitting sound as additional online material for publications is certainly a step forward, but will lead to further fragmentation, with valuable sound recordings hidden as supplementary material behind journal paywalls, or distributed over a wide variety of online repositories such as Figshare, Dryad, etc. Instead, a long-term, sustainable archival strategy should be centered around memory institutions, which in general have a longer half-life than states or private companies. Therefore, Riede and Jahn (2013) suggested that researchers submit sound recordings and well-annotated corpora to a few well-established memory institutions, comparable to common practice in genetics.
Traditional targeted song recordings of individual Orthoptera species have now been complemented by acoustic profiling using entire soundscapes (sensu Schafer 1994). Soundscapes are recorded routinely for environmental monitoring (Szeremeta and Zannin 2009) or military uses (Ferguson and Lo 2004). A huge number of recordings is generated by passive acoustic monitoring (PAM). Following a definition of Marques et al. (2013), PAM "refers loosely to methods using sounds made by animals to make inferences about their distribution and occurrence over space and time." (l.c., p. 290). There is a rapidly increasing number of acoustic monitoring initiatives recording overall soundscapes by Autonomous Recording Units (ARUs), using custom-built or commercial equipment. Acoustic monitoring by microphone arrays is a rapidly developing field, allowing exact 3D mapping of the position of songsters, reviewed by Blumstein et al. (2011). PAM focusses either on endangered vertebrate species or entire soundscapes.
Soundscape projects generally do not even try to identify or assess species compositions, but rather measure overall indices. Sueur et al. (2008b) applied signal analysis to entire soundscapes recorded at Tanzanian coastal forests, measuring entropy as a surrogate for biodiversity richness. Further recordings were made at biodiversity hotspots in New Caledonia and French Guiana (reviewed in Sueur et al. 2014). Such overall bioacoustic indices do not provide information about actual Orthoptera species presence and diversity, but informative snippets could be extracted (Riede and Jahn 2013, Lehmann et al. 2014). This means that post-hoc analysis for Orthoptera presence/absence at an ever-increasing number of acoustic monitoring sites is possible, if soundscape recordings are made available for re-analysis.
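To make the idea of an overall entropy index concrete, the following sketch computes a simple acoustic entropy as the product of a temporal and a spectral Shannon entropy, each scaled to the range 0 to 1. It is a simplified illustration and not the published procedure: the rectified signal stands in for a proper amplitude envelope, a naive discrete Fourier transform replaces an FFT, and all function names are hypothetical.

```python
import cmath
import math
import random


def normalized_entropy(weights):
    """Shannon entropy of a non-negative sequence, scaled to [0, 1]."""
    total = sum(weights)
    n = len(weights)
    if total == 0 or n < 2:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(n)


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 1 .. N/2 - 1 (short frames only)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(1, n // 2)
    ]


def acoustic_entropy(samples):
    """Product of temporal and spectral entropy of one frame."""
    h_t = normalized_entropy([abs(x) for x in samples])  # crude envelope
    h_f = normalized_entropy(magnitude_spectrum(samples))
    return h_t * h_f


# A pure tone concentrates energy in one frequency bin; noise spreads it out.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 8 * t / 128) for t in range(128)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(128)]
tone_index = acoustic_entropy(tone)
noise_index = acoustic_entropy(noise)
```

The example also shows why such an index says nothing about species identity: a single loud, narrow-band singer lowers the value regardless of which species produces it.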
The generated data volume is huge, and in most cases not publicly accessible or, as is the case for microphone arrays, not stored at all. Terabytes of acoustic recordings are stored on researchers' hard disks, with a high risk of getting lost, thereby impeding the chance for re-analysis. Only a small number of projects maintain servers to release soundscape recordings for re-analysis. Maintenance and release of soundscape data will provide opportunities and future challenges, as well as valuable data sources for orthopterists, because most PAM recordings from rainforests are dominated by insects, and Orthoptera in particular (Aide et al. 2017). At present, the Purdue soundscape server provides unlimited access to an impressive number of high-quality recordings (Pijanowski et al. 2011, Purdue Sound Ecology Project 2015). The extensive soundscape collection of Krause (2017) is commercial, but nevertheless available for Orthoptera song data mining.

Improving data coverage and requirements for data sharing
Improving data coverage.-The number of species covered by each database presented in Table 1 is not cumulative because there is a strong overlap between DORSA, Tierstimmenarchiv, and OSF, with a strong focus on European species. Exact numbers on SINA (Walker 2004b) are not available, and not every link from OSF to SINA leads to a sound recording. SINA is restricted to North American Ensifera, while Caelifera remain uncovered, apart from some very few historic acridid recordings from the Borror sound archive. For the time being, the best available documentation of North American acridid songs are verbal descriptions and musical annotations by Scudder (1868) and spectrogram figures published by Otte (1981). In light of the incomplete coverage of available sound libraries, filling the gaps for Orthoptera species without any song recording should have highest priority. Because OSF (Cigliano et al. 2018) is a taxonomic hub for all Orthoptera taxa, uploading at least one recording per species would be the most straightforward and efficient way to monitor progress of Orthoptera song coverage and store at least one song recording and/or parameter for each species. The orthopterist community is small and given the excellent communication between OSF curators and authors, the easiest way to increase the OSF song repository would be by proactive encouragement of authors to deposit their available recordings in OSF.
For most species with a SYSTAX-DORSA recording, a "typical" song has already been transferred to OSF, which presently contains songs for 818 species and subspecies (Cigliano et al. 2018). With a considerable number of recordings imported from DORSA, OSF has a similar bias towards European species. The addition of songs from newly described species will sooner or later compensate for this imbalance, but incorporation of such songs proceeds slowly: according to a "complex search" ("sounds" AND "description date >=2014", extracted 6/9/2015) in OSF, of the 857 species described since 2014, only eight have sound recordings in OSF: two Neoxabea spp. and three Oecanthus spp. described in Collins et al. (2014), Tettigonia balcanica Chobanov and Lemonnier-Darcemont, 2014 (Chobanov et al. 2014), and two Typophyllum spp. described by Braun (2015). For others (e.g. Walker and Funk 2014, Hemp et al. 2015, Baker et al. 2017) the publications contain detailed song descriptions, while the songs are either deposited outside OSF, or are not accessible at all. However, OSF already contains links, e.g. to Walker and Funk's (2014) recordings, and it would be a comparatively easy task to transfer additional songs to OSF. Likewise, editors of Orthoptera song CDs (e.g. Rentz 1996, Naskrecki 2000) are actively involved in the enrichment of OSF and are probably disposed to contribute their CD recordings.
Problems of data sharing and file exchange.-At present, federated bioacoustics datasets downloaded from GBIF have issues resulting from unresolved problems between data providers and GBIF. Macaulay (Scholes 2015), Systax (2017) and BioAcoustica (Baker and Rycroft 2017) are registered, citable GBIF data providers, but occurrences disappear once the multimedia audio filter is applied.
In addition, downloading sound files from currently available repositories separates the sound file from its metadata. The safest way to avoid this is to store metadata within the sound file. Typically, a spoken announcement by the recordist contains information about time, place, temperature, microphone, and recording conditions. However, if this information is clipped for the sake of signal clarity and detectability, a downloaded sound file cannot be attributed to its source and metadata. For SYSTAX-DORSA sound files, the Soundminer software (http://store.soundminer.com/) was used to annotate metadata, showing species name and source when displayed on most devices (Fig. 3).
Embedding metadata within the sound file creates redundancy which can be used to restore or cross-check the links between the original database storing the metadata and the multimedia object.
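As a rough illustration of embedding metadata within a sound file, the following sketch appends a standard RIFF LIST/INFO chunk to a WAV file and reads it back. This is not the Soundminer workflow used for SYSTAX-DORSA. The function names are hypothetical, and the mapping of species name to the INAM (title) tag and recordist to the IART (artist) tag is an assumption chosen to mirror the display described for Fig. 3.

```python
import struct


def _pad(data: bytes) -> bytes:
    """RIFF chunks are word-aligned: pad odd-length data with one zero byte."""
    return data + b"\x00" * (len(data) % 2)


def append_info_chunk(path: str, fields: dict) -> None:
    """Append a LIST/INFO chunk to a RIFF/WAVE file and update the RIFF size."""
    body = b"INFO"
    for tag, text in fields.items():              # tags are 4-character codes
        value = text.encode("utf-8") + b"\x00"    # null-terminated string
        body += tag.encode("ascii") + struct.pack("<I", len(value)) + _pad(value)
    chunk = b"LIST" + struct.pack("<I", len(body)) + body
    with open(path, "r+b") as f:
        header = f.read(12)
        if header[:4] != b"RIFF" or header[8:12] != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        f.seek(0, 2)                              # jump to end of file
        if f.tell() % 2:
            f.write(b"\x00")                      # keep word alignment
        f.write(chunk)
        riff_size = f.tell() - 8
        f.seek(4)
        f.write(struct.pack("<I", riff_size))


def read_info_chunk(path: str) -> dict:
    """Return the tag/text pairs of the first-level LIST/INFO chunks."""
    with open(path, "rb") as f:
        data = f.read()
    pos, info = 12, {}
    while pos + 8 <= len(data):
        name = data[pos:pos + 4]
        size = struct.unpack("<I", data[pos + 4:pos + 8])[0]
        payload = data[pos + 8:pos + 8 + size]
        if name == b"LIST" and payload[:4] == b"INFO":
            sub = 4
            while sub + 8 <= len(payload):
                tag = payload[sub:sub + 4].decode("ascii")
                n = struct.unpack("<I", payload[sub + 4:sub + 8])[0]
                text = payload[sub + 8:sub + 8 + n].rstrip(b"\x00")
                info[tag] = text.decode("utf-8")
                sub += 8 + n + (n % 2)
        pos += 8 + size + (size % 2)
    return info
```

A call such as `append_info_chunk("song.wav", {"INAM": "Gryllus campestris", "IART": "recordist name", "ICMT": "22.5 C"})` leaves the audio data untouched, so the annotated file should remain playable, while `read_info_chunk` allows the embedded values to be cross-checked against the database record.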

Future needs: a data warehouse for bioacoustic data
A combination of features from all databases reviewed here probably describes best the requirements for an ideal Orthoptera song data warehouse. In particular:
• Baseline collection data such as recordist, time, and locality.
• Cross-reference to voucher specimen, if available: repository (e.g. museum collection), unique identifier (collection number), identifier, and baseline data.
• If no voucher specimen is available, an image, video and comments on taxonomic reliability by naming the identifier.
• Comprehensive metadata for each recording, in particular temperature, microphone with frequency characteristics and distance from specimen, and preferably sound intensity at a given distance.
• User-friendly upload and query interface for input.
On the output side users need:
• Advanced search functions.
• Basket function for download of selected songs and/or corpora, including metadata.
Optional requirements include online visualization of sound files (spectrogram/oscillogram), generation of bioacoustic factsheets, and flexible tools for annotation of song parameters.
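The input-side requirements listed above could be captured in a simple record structure. The sketch below is one hypothetical way to do so; the class name, field names, and the completeness check are illustrative assumptions and do not reflect the schema of any existing database.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SongRecord:
    """One song recording with the baseline data and metadata listed above."""
    species: str
    recordist: str
    recorded_at: str                          # ISO 8601 date and time
    locality: str
    temperature_c: Optional[float] = None
    microphone: Optional[str] = None
    mic_distance_m: Optional[float] = None
    voucher_repository: Optional[str] = None  # e.g. museum collection
    voucher_id: Optional[str] = None          # collection number
    identified_by: Optional[str] = None       # person who determined the taxon
    media_links: List[str] = field(default_factory=list)  # image or video

    def missing_requirements(self) -> List[str]:
        """List the requirements this record does not yet meet."""
        problems = []
        if self.temperature_c is None:
            problems.append("temperature")
        if self.voucher_id is None and not self.media_links:
            problems.append("voucher specimen or image/video")
        if self.identified_by is None:
            problems.append("identifier")
        return problems


# Hypothetical example record with baseline data only
record = SongRecord(
    species="Gryllus campestris",
    recordist="A. Recordist",
    recorded_at="2018-06-01T21:30",
    locality="hypothetical site",
)
gaps = record.missing_requirements()
```

A check of this kind could run at upload time, so that incomplete records are flagged to the contributor instead of being rejected outright.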
Building on these basic features, a bioacoustics workbench could provide efficient, reciprocal connection to taxonomic (OSF) and specimen-based federated specimen databases (GBIF).
None of the existing databases fulfills all these requirements. Therefore, the way forward is interoperability and the federation of existing multimedia databases. Commercial or community multimedia providers like the pioneering peer-to-peer filesharing program Napster (https://en.wikipedia.org/wiki/Napster), iTunes, or SoundCloud (https://soundcloud.com/) demonstrate that efficient, user-friendly data management and federation of sound files is feasible, but these services are not designed for scientific use, which requires annotation, citability and sustainability of repositories. GBIF federates specimen data. It allows filtering for audio data, providing multimedia links, but without any interface for direct listening or bulk download via shopping basket functions. However, GBIF is evolving rapidly and is attentive to users' needs. Among the existing sound libraries, BioAcoustica (http://bio.acousti.ca/, Baker et al. 2015) comes closest to the requirements outlined above due to its modular design using cutting-edge technology.
A scheme illustrating elements and workflows of a bioacoustics data warehouse is presented in Fig. 4. A fully developed bioacoustic workbench should allow seamless integration of entire soundscape recordings (as generated by PAM) and tools for managing acoustic scenes, with software for annotation and identification of acoustic snippets (cf. Riede and Jahn 2013), and reference corpora generated from targeted recordings with taxonomically identified voucher specimens.
A well-designed data warehouse infrastructure is the only way to organize efficient workflows between taxonomists (providing reference sound libraries) and computer scientists developing algorithmic recognition tools. Ideally, code and documentation of recognizer software should be publicly accessible through the (virtual) data warehouse, together with the sound libraries and references to voucher specimens. For the time being, it is suggested to establish OSF as a taxonomic backbone to host at least one song recording per species, which would allow for verifying completeness of bioacoustic coverage of singing Orthoptera species. Every sound file could be associated with a unique Life Science Identifier (LSID), comparable to Digital Object Identifiers (DOI), facilitating the necessary cross reference between names, multimedia files, voucher specimens, and eventually genetic sequences. However, at present, a functional LSID architecture is jeopardized by lack of standards (cf. Table 1 in Guralnick et al. 2015).

The way forward: algorithms for acoustic profiling
Well-documented, comprehensive song libraries are the prerequisite for the next logical step, which is acoustic profiling of entire communities. This is particularly promising for lesser known tropical faunas, where acoustic recording could accelerate species assessment. Up to now, overall analyses of Orthoptera communities based on entire soundscapes are still limited to very few sites. Lehmann et al. (2014) used ARUs in the Hymettos mountain range, Greece. Tropical Orthoptera communities have been assessed in the Western Ghats, India (Diwakar et al. 2007), Panama (Schmidt et al. 2013) and Amazonian Ecuador, the latter based exclusively on ethospecies (Riede 1993). Evidence that ethospecies can be reliably attributed to well-defined morphospecies was provided by systematic recording of captured individuals in Ecuadorian lowland and mountain rainforests (Nischk and Riede 2001).
There is a fundamental difference between 1) automatic classification and identification of individual recordings, consisting of high-quality sound signals of an unknown Orthoptera songster, and 2) recognition of Orthoptera songs "hidden" within overall soundscape recordings. The two problems are quite distinct, and the latter requires additional, complex processing steps. Therefore, they are discussed separately in the following sections.
Fig. 3. Embedding metadata within sound files. Metadata were embedded within wav and mp3 fields directly from the SYSTAX database using Soundminer software (http://store.soundminer.com/). Metadata are visible within most mp3-players, displaying the species name as "TrackTitle" and the recordist as "Artist" (Courtesy: S. Ingrisch).
Classification of individual recordings.- For individual recordings, song parameters such as pulse rate and carrier frequency can be easily extracted by basic sound analysis software. These parameters might be sufficient to identify species using a traditional taxonomic key (Ragge and Reynolds 1998, p. 83) based on acoustic features. Benediktov (2015) analyzed a calling orthopteran community (Tettigoniidae and Gryllidae) from an agrocenosis in eastern Bulgaria by straightforward interpretation of spectrograms, showing that valuable information can be extracted from overall recordings "manually", without complex computer algorithms. Such direct comparison of song parameters with available feature datasets was classified as a "brute force" approach by Tacioli et al. (2017).
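To illustrate how pulse rate and carrier frequency might be extracted from a clean individual recording, the following Python sketch analyses a synthetic pulse train. It is not the procedure of any of the cited studies: the smoothed-envelope threshold, the zero-crossing estimate and all function names and parameter values are assumptions chosen for brevity, and real recordings would need band-pass filtering and a more robust spectral estimate.

```python
import math
from statistics import median


def synth_song(fs=44100, duration_s=1.0, carrier_hz=4000.0,
               pulse_rate_hz=50.0, duty=0.5):
    """Synthetic test song: a sine carrier switched on and off in pulses."""
    period = round(fs / pulse_rate_hz)   # samples per pulse period
    on_len = int(duty * period)          # samples with sound per period
    return [
        math.sin(2 * math.pi * carrier_hz * n / fs) if (n % period) < on_len else 0.0
        for n in range(int(fs * duration_s))
    ]


def envelope(x, half_width):
    """Centered moving average of the rectified signal (via prefix sums)."""
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + abs(v))
    n_total = len(x)
    width = 2 * half_width + 1
    return [
        (prefix[min(n_total, n + half_width + 1)] - prefix[max(0, n - half_width)]) / width
        for n in range(n_total)
    ]


def song_parameters(x, fs, smooth_s=0.001):
    """Return (carrier frequency in Hz, pulse rate in pulses per second)."""
    env = envelope(x, max(1, int(smooth_s * fs / 2)))
    threshold = 0.5 * max(env)
    active = [e > threshold for e in env]

    # Pulse rate from the median interval between successive pulse onsets.
    onsets = [i for i in range(1, len(x)) if active[i] and not active[i - 1]]
    if len(onsets) < 3:
        raise ValueError("too few pulses detected")
    intervals = [b - a for a, b in zip(onsets, onsets[1:])]
    pulse_rate = fs / median(intervals)

    # Carrier frequency from zero crossings counted inside pulses only.
    crossings = sum(
        1 for i in range(1, len(x))
        if active[i] and active[i - 1] and x[i - 1] * x[i] < 0
    )
    carrier = crossings / (2 * sum(active) / fs)
    return carrier, pulse_rate
```

Calling `song_parameters(synth_song(), 44100)` should return values close to the 4000 Hz carrier and 50 pulses per second used to build the test signal. The median of the onset intervals is used so that a truncated first or last pulse does not distort the estimate.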
More complex software for Orthoptera song identification is based on Artificial Neural Networks (ANN) and Hidden Markov Models (HMM), which are widely used in automatic human speech recognition (Mustafa et al. 2017). Because ANNs have to be trained on a set of training recordings and later tested on a separate validation set, this approach is only possible for species with at least ten recordings of distinct specimens. Dietrich et al. (2004) used ANNs and temporal fusion to classify 31 Orthoptera songs from the DORSA database (Ingrisch et al. 2004). Potamitis et al. (2006) used the SINA repository and some additional resources to test automatic identification of insects using speech recognition tools. In a follow-up publication, Ganchev and Potamitis (2007) applied a hierarchical classification scheme, with identification accuracy that exceeded 99% at suborder and family levels. Chaves et al. (2012) used Costa Rican katydid songs from the Naskrecki (2000) CD for sound parameterization using Mel Frequency Cepstral Coefficients and subsequent classification based on HMM, resulting in high accuracy of identification. Riede et al. (2006) annotated Grylloidea from the SYSTAX-DORSA files with essential parameters such as carrier frequency and pulse rate. They applied a batch routine, using segmentation and feature extraction modules developed by Dietrich et al. (2004), to annotate song parameters for hundreds of recordings from 53 species. Tacioli et al. (2017) reviewed basic principles of existing animal sound identification software and implemented user-friendly, downloadable software (Wildlife Sound Identification Software (WASIS), http://www.naturalhistory.com.br/wasis.html). At present, the underlying reference database contains recordings from Neotropical birds and amphibians, but it should be possible to use this promising approach for Orthoptera song recognition as well.
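The requirement for several recordings of distinct specimens can be made concrete with a short Python sketch that prepares training and validation sets. The threshold of ten specimens follows the text; the validation fraction, the function name and the tuple layout of the input are assumptions for illustration. Splitting by specimen rather than by recording keeps songs of the same individual from appearing in both sets.

```python
import random
from collections import defaultdict


def split_by_specimen(recordings, min_specimens=10,
                      validation_fraction=0.3, seed=0):
    """Split recordings into training and validation sets per species.

    recordings: iterable of (recording_id, species, specimen_id) tuples.
    Species with fewer than min_specimens distinct specimens are skipped.
    """
    by_species = defaultdict(lambda: defaultdict(list))
    for rec_id, species, specimen in recordings:
        by_species[species][specimen].append(rec_id)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, validation, skipped = [], [], []
    for species in sorted(by_species):
        specimens = sorted(by_species[species])
        if len(specimens) < min_specimens:
            skipped.append(species)
            continue
        rng.shuffle(specimens)
        n_val = max(1, round(validation_fraction * len(specimens)))
        for k, specimen in enumerate(specimens):
            target = validation if k < n_val else train
            target.extend((rec_id, species) for rec_id in by_species[species][specimen])
    return train, validation, skipped
```

The third return value lists the species that cannot be used for training at all, which gives a direct measure of how far a reference library is from supporting this class of algorithms.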
Fig. 4. A data warehouse for sound management. The scheme illustrates elements and workflow for acoustic profiling of Orthoptera. Songs are sampled either by recording individual songsters (Targeted Recordings), or entire acoustic scenes, each of which could contain several Orthoptera songs. Targeted recordings are treated like specimens, with time and locality stamps and, preferably, a voucher specimen. All databases listed in Table 1 are designed to store individual recordings. These distributed databases could be federated via ABCD or Darwin-protocol. Soundscapes require distinct data management of large multimedia files. Orthoptera songs could be extracted manually or semi-automatically as sound snippets, and eventually be identified (ID) manually, or using automatic sound recognition algorithms (ASR). Many snippets can be extracted from each scene, resulting in a one-to-many relationship between scenes and snippets.
Journal of Orthoptera Research 2018, 27(2)
Data-mining soundscapes.- Identification of individual species in soundscapes is a much harder task because of noise and highly variable microphone distances from songsters. As a first step, Regions of Interest (ROIs), i.e. sound signals probably containing a song, have to be identified and filtered. In a second step, these ROIs can eventually be treated and classified like individual recordings. A considerable number of publications report successful algorithmic identification of sets of bat (Jennings et al. 2008), bird (Potamitis et al. 2014), and frog (Hu et al. 2009) species within field recordings from certain sites. As with individual song recognition software, these algorithms have to be trained, requiring a considerable number of training recordings, preferably from the respective area.
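The first step, locating ROIs, can be sketched with a simple energy detector. The Python example below is only one of many possible approaches and is not taken from the cited studies: the frame length, the relative threshold, the gap-merging rule and the synthetic test signal are all assumptions, and field recordings with wind, rain or overlapping songsters would require band-limited or spectrogram-based detection.

```python
import math


def demo_signal(fs=8000, duration_s=3.0, bursts=((0.5, 1.0), (2.0, 2.4))):
    """Synthetic scene: faint 50 Hz background plus loud 1 kHz song bursts."""
    x = []
    for n in range(int(fs * duration_s)):
        t = n / fs
        background = 0.01 * math.sin(2 * math.pi * 50.0 * t)
        loud = any(start <= t < end for start, end in bursts)
        song = math.sin(2 * math.pi * 1000.0 * t) if loud else 0.0
        x.append(background + song)
    return x


def find_rois(x, fs, frame_s=0.01, threshold_ratio=0.25, min_gap_s=0.05):
    """Return (start_s, end_s) intervals whose frame RMS exceeds a threshold.

    The threshold is relative to the loudest frame; loud frames separated by
    gaps of at most min_gap_s are merged into a single ROI.
    """
    frame_len = max(1, int(frame_s * fs))
    n_frames = len(x) // frame_len
    rms = [
        math.sqrt(sum(v * v for v in x[k * frame_len:(k + 1) * frame_len]) / frame_len)
        for k in range(n_frames)
    ]
    if not rms or max(rms) == 0:
        return []
    threshold = threshold_ratio * max(rms)

    rois = []
    for k, level in enumerate(rms):
        if level <= threshold:
            continue
        start = k * frame_len / fs
        end = (k + 1) * frame_len / fs
        if rois and start - rois[-1][1] <= min_gap_s:
            rois[-1] = (rois[-1][0], end)   # extend the previous ROI
        else:
            rois.append((start, end))
    return rois
```

Applied to `demo_signal()`, the detector should recover the two bursts that were inserted into the scene. Each returned interval corresponds to a snippet that could then be cut out and passed to a classifier for individual recordings.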
Most recognition software was developed for birds, based on extensive corpora of overall soundscape recordings and high numbers of individual, labelled species recordings used for training and testing. Knight et al. (2017) provide an overview of underlying principles and performance benchmarking of five readily available species recognition programs. Among these programs, the template-based monitoR software (Katz et al. 2016) is particularly promising, because it is a package implemented in R (https://www.r-project.org/), a free software project becoming increasingly popular among biologists. In addition, R contains the seewave package (Sueur et al. 2008a), designed for sound analysis and synthesis. Users familiar with R can modify or combine it with other R packages (Sueur 2018). Ovaskainen et al. (2018) developed Animal Sound Identifier (ASI), an interesting toolbox running on Matlab. Unlike most previous approaches, ASI locates training data directly from the field recordings and thus avoids the need for pre-defined reference libraries. Phillips et al. (2018) present an impressive method of reducing audio data by six orders of magnitude, facilitating the interpretation of environmental audio. By clustering vectors of acoustic indices, they were able to attribute clusters to dominant sound sources, such as birds, cicadas, or Orthoptera. They were able to determine Orthoptera calling date and time of day within a huge dataset of 26 months of recordings. With this pre-processing, it should be easy to extract relevant Orthoptera snippets and eventually store them as "ethospecies" (sensu Riede 1993) for future identification.
To facilitate multiple use of sound files for improving algorithms, the respective sound files should be tagged and labelled as a corpus. A wide variety of well-documented corpora is available to be used in computational linguistics and speech recognition. A speech corpus is a well-defined set of speech audio files (Harrington 2010), and a prerequisite for reproducible results in classifier and recognizer development. Well-curated corpora are not yet available in bioacoustics (cf. Riede and Jahn 2013), which hampers progress of computer-aided analysis.

Discussion
Otte and Alexander (1983) were the first to point out the enormous potential of communicative signal analysis for understanding the systematics and taxonomy of Orthoptera: "It must be clear at this point that those systematists who utilize communicative signals and isolating mechanisms as their principal means of locating and recognizing species are not simply studying biology as well as morphology, or simply using a wide variety of characters, as is commonly and justifiably considered desirable in bio-systematic work. Their entire approach, their methods of analysis, and their interpretations of particular kinds of data are all different. Further, and probably most important, their possibilities for rapid and accurate systematic coverage are unparalleled. For this reason, the groups of animals for which these techniques are possible ought to present unique opportunities for breakthroughs in biogeography and in the study of speciation and other evolutionary phenomena." (l.c., p. 5). Three decades later, bioacoustic characters of Orthoptera songs frequently form part of species descriptions, taxonomic revisions (e.g. Anatolian Chorthippus species: Mol et al. 2003), as well as phylogenetic studies (Desutter-Grandcolas 2003, Nattier et al. 2011), being a well-established element of a comprehensive, "integrative" taxonomy (Dayrat 2005, Schlick-Steiner et al. 2010).
To mobilize the full potential of sound repositories for biodiversity research, innovative query tools are needed. The vision is to upload a sound recording to a data warehouse portal and search for similar acoustic patterns, comparable to BLAST (Altschul et al. 1990), available as a tool in genetic databases (e.g. NCBI). The potential of such innovative tools will be further enhanced by federated access to distinct sound archives, using one portal with a unified query tool. As a next step, applications running on portable computers could allow classification and identification of songs in the field. Such an infrastructure sounds demanding, but its elements are already available.
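A feature-based query of this kind can be illustrated, in a much simpler form than sequence alignment, by ranking annotated reference songs by their distance to a query. In the Python sketch below, the reference entries are placeholders rather than real species data, and the choice of two features (carrier frequency and pulse rate) with a logarithmic distance is an assumption made for illustration only.

```python
import math

# Hypothetical annotated library: (label, carrier frequency in Hz, pulses per second).
REFERENCE_LIBRARY = [
    ("reference song A", 4000.0, 50.0),
    ("reference song B", 4200.0, 25.0),
    ("reference song C", 12000.0, 50.0),
]


def log_distance(a, b):
    """Euclidean distance between feature vectors on a logarithmic scale.

    Taking logarithms makes a doubling of frequency count the same as a
    doubling of pulse rate, regardless of the units of each feature.
    """
    return math.sqrt(sum(math.log(p / q) ** 2 for p, q in zip(a, b)))


def query(carrier_hz, pulse_rate_hz, library=REFERENCE_LIBRARY, top_k=2):
    """Return the top_k most similar reference songs as (label, distance)."""
    scored = [
        (log_distance((carrier_hz, pulse_rate_hz), (carrier, rate)), label)
        for label, carrier, rate in library
    ]
    scored.sort()
    return [(label, distance) for distance, label in scored[:top_k]]
```

For example, `query(4100.0, 48.0)` ranks the first placeholder entry highest, because both of its parameters lie within a few percent of the query. A federated portal would run the same kind of ranking across several archives at once.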
Thanks to the rapid technological evolution of hardware and software, complex Artificial Intelligence tools for recognition of human speech, music and animal sounds are now available for personal devices such as smartphones. Commercial programs and apps such as Shazam (for music recognition: https://play.google.com/store/apps/details?id=com.shazam.android&hl=en_US) or Alexa (for human speech recognition: https://play.google.com/store/apps/details?id=com.amazon.dee.app&hl=de) are well-known examples. Evidently, speech recognition is of considerable military and economic interest, which means that large parts of on-going research are not accessible to the research community. This might be the reason that animal sound recognition lags far behind the performance of the above-mentioned commercial products.
PAM in combination with computer-aided algorithms could lead to major progress in species monitoring and discovery. Lomolino et al. (2015) highlight the potential of these ecoacoustic surveys for biogeography. However, most of the terabytes resulting from PAM are only used to calculate soundscape indices, to be used in landscape ecology (cf. Ross et al. 2018). Ferreira et al. (2018) compared six soundscape indices with sonotype richness in a species-rich Brazilian tropical savanna. A sonotype is equivalent to the acoustic morphospecies (Aide et al. 2017) or ethospecies (Riede 1993). It is recognizable as an individual vocalization, but not necessarily supported by a reliable species identification. Ferreira et al. (2018) showed that the majority of sonotypes could not be attributed to birds. They criticize the bias of several indices toward avifauna and emphasize the need to include insects and anurans in ecoacoustics.
While there has been considerable progress in bird song recognition and the labelling of large audio datasets, a comparable milestone has not yet been reached for Orthoptera. This is probably due to insufficient coverage in sound libraries. Orthoptera songs have been documented for fewer than 20% of described species. Reference sound libraries are missing not only for tropical regions, but are incomplete even for well-known faunas, e.g. North American grasshoppers, despite comprehensive literature including detailed descriptions of communication and spectrograms of songs (Otte 1981). In addition, complex software for training ANNs requires several recordings for each species. Therefore, for simple logical reasons, neither queries nor sophisticated software will produce useful results with reference libraries containing only 1 or 0 recordings for each species.
It must be doubted that self-organizing scientific routine procedures will suffice to establish the necessary infrastructure sketched here. A strong commitment to data sharing as part of good scientific practice is needed, preferably under the leadership of the respective scientific societies such as IBAC or the Orthopterists' Society, together with representatives from major sound archives. The OSF (Cigliano et al. 2018) provides an authoritative taxonomic backbone and tools for the upload and retrieval of sound files. At present, OSF database managers and editors of the Journal of Orthoptera Research encourage submission of sound files together with manuscripts, but there is no obligation. In contrast, submission of gene sequence data to the NCBI is a prerequisite for publication, resulting in rapid population of gene banks and impressive advances in molecular biology.
Despite promising first results, an efficient connection and data flow between sound archives, museum collections, advanced computational tools and users has not yet been established. Close cooperation of biologists with computer engineers is needed to cope with the data deluge generated by PAM. Again, well-curated and documented song libraries are a prerequisite to exploit bioacoustic Big Data for further biodiversity assessments.
Basically, an efficient acoustic sampling strategy should consist of the following components:
1. Protocols for standardized acoustic recording, at species and community level, using acoustic data loggers for autonomous long-term recordings.
2. Open access to and efficient management of sound recordings, song data, and voucher specimens, involving the Orthoptera Species File (OSF: Cigliano et al. 2018) as a taxonomic backbone, and the Global Biodiversity Information Facility (GBIF) for federation of distinct biodiversity multimedia databases.
3. An infrastructure for automatic analysis and song classification for on-the-ground and web-based analysis, including Web 2.0 applications for user communities and citizen science.
4. A strategic framework for future inventorying and monitoring efforts, including geographic priorities.
Components 1 and 3 involve the entire terrestrial bioacoustics research community, requiring considerable effort to overcome fragmentation between distinct bioacoustic subgroups, clustering around distinct taxa (e.g. frogs, birds, etc.). In contrast, components 2 and 4 focus on Orthoptera and are feasible, eventually serving as a model for other species groups.

Conclusions
A recent commentary paper by Deichmann et al. (2018) entitled "It is time to listen" called for a systematic monitoring of rainforest soundscapes. Singing insects are the principal component of these soundscapes. Orthoptera songs are characterized by well-defined signal parameters such as carrier frequency and pulse rate. Acoustic profiling techniques have much to offer, from rapid assessment and species discovery of acoustically active species in remaining wilderness areas to continuous monitoring in managed landscapes. Their full potential can only be developed by cooperative data sharing. At present, an increased wealth of digitized bioacoustic data leads to confusing fragmentation: without the creation of a data warehouse infrastructure, bioacousticians will lose an excellent opportunity to exploit potential synergies from on-going soundscape monitoring initiatives and contribute to urgently needed biodiversity assessments. Likewise, without willingness for data sharing, the newly emerging field of ecoacoustics will generate fragmented soundscape monitoring projects.

Fig. 2. The SYSTAX database. Screenshot of the new SYSTAX user interface, to be released under www.systax.org. A search for the Neotropical tettigoniid genus Anaulacomera recovers several sound recordings from a voucher specimen of a hitherto undescribed species, documented by photographs. Faceting allows searching by images or sounds exclusively.

Table 1. Digitized Orthoptera songs in major sound archives and databases. For further details on issues and special features see text.
... data for about 2000 species, which is still only 20% of all known stridulating species. If we assume that another 20,000 Orthoptera species still remain to be described (again, a conservative estimate, cf. Stork et al. 2015), we get an idea of the daunting task ahead!
Accessibility of new digital recordings.- The amount of multimedia data documenting animal songs is growing exponentially thanks to Passive Acoustic Monitoring (PAM) and citizen science efforts (cf. August et al. 2015, Di Minin et al. 2015). In addition, behavior and song recordings can be found on YouTube (see Olivero and Robillard 2017, for cricket behavior "in the wild from YouTube") or as digital supplementary material for scientific journals.