dleucas / wmmsdb (public) (License: GPLv3) (since 2018-07-08) (hash sha1)
A collection of scripts to download, transform and normalize the Watkins Marine Mammal Sound Database.

Credit:

“Watkins Marine Mammal Sound Database, Woods Hole Oceanographic Institution.”

http://cis.whoi.edu/science/B/whalesounds/index.cfm
List of commits:
Subject Hash Author Date (UTC)
WIP document acoustat 4591875fd32c1c91d20133ff90dcf5676b3c216c dleucas 2019-06-07 00:08:18
WIP document acoustat ccc4a6de663a7272ee3d5777fe1479af549e9938 dleucas 2019-06-06 01:39:08
WIP document acoustat 1c6b03267e3016d9b637775df2b4b153866ac040 dleucas 2019-06-05 22:39:39
add dependency on nav.html and pandoc.css b4f054eb6675117d576fa9462220bc5bc8d15be4 dleucas 2019-06-01 01:12:44
nav title 7c5fadd2e143028d614fab4c31ed7389ed17e6f6 dleucas 2019-06-01 01:12:04
document world map 07b70f4b85731456b559edeecea12b339e724aaf dleucas 2019-06-01 01:11:43
use live ElasticSearch URL in example query b6df3b91ba395f62003fa89f1d6ae3f6a705ea9e dleucas 2019-05-31 23:38:40
Document geo coordinates based on the .GC field d3c9fc90773c6252209074583a00b290633f340c dleucas 2019-05-31 23:24:02
transform geo coordinates b32a76ac0917bc3c8bb85e225b19004bc56ac929 dleucas 2019-05-31 22:26:16
update index mapping with geo_point for location coordinates 510a37c03a86603425c6af347c038d69d7fa0cde dleucas 2019-05-31 22:24:42
WIP search by geo distance 61e13ff42bf7e9d283c7d18087c14bb30c2e69c8 dleucas 2019-05-31 22:16:46
about page: show task progress 9372732e21c753f99d706a474cb3a74394ced1dc dleucas 2019-05-24 01:04:07
data page: describe mapping, clean-up table a78264311f89a3b67d40fa8ba3d768c3ec9b2592 dleucas 2019-05-24 00:33:21
add site navigation, remove clutter 9e0361707284cc01d6795195df6848c1f20e88cc dleucas 2019-05-23 01:21:22
WIP initial changelog 12150e01cbb190f83d9e5f02f1431a4856c780d2 dleucas 2019-05-23 01:20:30
add source code links to about page 2fbd58dd9ee99c0568c0c8005bb7becafa8c2163 dleucas 2019-05-23 01:19:28
add download links 7cc06b24732d4167bf8d357c68098f19a66fd01d dleucas 2019-05-23 00:46:36
WIP notes on database fields 2cafc0525643f5dc0ed0a39bb1f914df4fae4f94 dleucas 2019-05-23 00:24:35
WIP notes on database fields 30cc261f4b22da6f6304b33bb57e65b2d145e931 dleucas 2019-05-22 23:43:27
more space for tables 332cdefc80e7d01046dbfe4198c16af92e85f1c4 dleucas 2019-05-22 23:42:50
Commit 4591875fd32c1c91d20133ff90dcf5676b3c216c - WIP document acoustat
Author: dleucas
Author date (UTC): 2019-06-07 00:08
Committer name: dleucas
Committer date (UTC): 2019-06-07 00:08
Parent(s): ccc4a6de663a7272ee3d5777fe1479af549e9938
Signing key:
Tree: 22bdbe14e3dc0f44020ef3d413d5a38272db097c
File Lines added Lines deleted
webroot/changelog.md 25 9
File webroot/changelog.md changed (mode: 100644) (index e39f209..12dbb9b)
... ... on a software tool "Characterizing acoustic features of marine animal sounds".
113 113 > Aggregate Bandwidth, Intensity, Duration, Amplitude Modulation, Frequency Modulation,
114 114 > Short-term Bandwidth, Center Frequency, and Amplitude Frequency Interaction.
115 115
116 This seems like a perfect fit, to gain various statistical properties from the signals time and frequency domain.
116 The paper further asks:
117 117
118 An implementation, of some (but not all) of the acoustic features functions, exists in the popular [seewave][seewave.acoustat] library,
119 written in the [statistical computing language R][R] by Jerome Sueur.
118 > Do the differences in these sound features remain distinctive as the scope of comparison widens?
119 > With our own ears, we can often distinguish acoustic features that appear to be species-specific,
120 > and sometimes features unique to individual animals;
121 > can we specify numerical algorithms that objectively recognize these distinctions?
122
123 The mentioned "Aggregate Bandwidth" or "aggregate power spectrum" function exists in the popular [seewave][seewave.acoustat] library,
124 written in the [statistical computing language R][R] by Jérôme Sueur.
120 125
121 126 The [manual][seewave.acoustat] states:
122 127
123 128 > acoustat was originally developed in Matlab language by Fristrup and Watkins (1992). The R function was kindly checked by Kurt Fristrup.
124 129
125 Other methods are to be explored in the future.
130 There are a few things to note about the default `acoustat()` parameters, compared to the workflow described in the paper.
131
132 - There is no noise filter
133 - The fraction defaults to 10%/90% instead of the paper's 25%/75%
134
135 For now the `acoustat()` parameters are **left to the defaults**.
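
To see what the fraction setting changes, here is a minimal Python sketch. It is not the seewave implementation; the function name, the toy spectrum and the result keys (`P1`, `M`, `P2`, `IPR`) are assumptions made for illustration. It takes an aggregate spectrum, builds the cumulative energy share, and reports where that share first reaches the lower, median and upper fractions.

```python
from itertools import accumulate


def percentile_positions(power, axis, lower, upper):
    """Return the axis values where the cumulative share of `power`
    first reaches `lower`, 0.5 and `upper` (fractions between 0 and 1)."""
    total = sum(power)
    if total <= 0:
        raise ValueError("power must contain positive energy")
    cumulative = [c / total for c in accumulate(power)]

    def first_reaching(fraction):
        # Walk the bins in order and stop at the first one whose
        # cumulative share is at least the requested fraction.
        for value, share in zip(axis, cumulative):
            if share >= fraction:
                return value
        return axis[-1]

    p1, median, p2 = (first_reaching(f) for f in (lower, 0.5, upper))
    return {"P1": p1, "M": median, "P2": p2, "IPR": p2 - p1}


# Toy aggregate spectrum: energy per 1 kHz bin (made-up numbers).
freqs_khz = [1, 2, 3, 4, 5, 6, 7, 8]
power = [1, 2, 6, 10, 10, 6, 3, 2]

wide = percentile_positions(power, freqs_khz, 0.10, 0.90)
narrow = percentile_positions(power, freqs_khz, 0.25, 0.75)
print(wide)
print(narrow)
```

On the toy data the 10%/90% bounds span a wider band than the 25%/75% bounds, which is one reason values computed with different fractions cannot be compared directly.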
126 136
127 137 #### Implementation
128 138
 
... ... Running the analysis *effectively* requires a task management tool. It keeps tra
134 144 A bash script can run the task in parallel, but [GNU Make][make] provides a clear _state_: how far the processing of all sound clips has progressed.
135 145 It does that by keeping track of input and output files. If an output JSON file does not exist, the job is not done.
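
The "missing output means unfinished job" rule can be sketched in a few lines of Python. This is only an illustration of the idea, not part of the repository: the helper name, the demo file names, and the assumption that `clip.wav` maps to `clip.acoustat.json` in the same directory are all hypothetical. GNU Make additionally compares modification times, which this sketch leaves out.

```python
from pathlib import Path
import tempfile


def pending_clips(directory):
    """Return the .wav files that have no matching .acoustat.json yet."""
    directory = Path(directory)
    return sorted(
        wav
        for wav in directory.glob("*.wav")
        # Assumed naming scheme: clip.wav -> clip.acoustat.json
        if not wav.with_name(wav.stem + ".acoustat.json").exists()
    )


# Demo in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("a.wav", "b.wav", "c.wav"):
        (root / name).touch()
    (root / "b.acoustat.json").write_text("{}")  # pretend clip b is done
    todo = [p.name for p in pending_clips(root)]

print(todo)
```

Rerunning the check after an interrupted run lists only the clips that still need processing, which is the same property Make provides for free.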
136 146
137 This simplified `Makefile` defines `*.wav` files as INPUTS and `*.acoustat.json` as `ACOUSTAT` outputs using `acoustat.json.r` as job processor.
147 This **simplified** `Makefile` defines `*.wav` files as INPUTS and `*.acoustat.json` as `ACOUSTAT` outputs using `acoustat.json.r` as job processor.
138 148
139 149 ```Makefile
140 150 DIR = $(abspath .)
 
... ... The number should equal the number of available CPU cores.
186 196 ```bash
187 197 make -j 8
188 198 ```
189 A single output JSON file looks like this:
199 A single output JSON file looks like this, with P1 and P2 being the lower and upper estimates mentioned in the [Fristrup and Watkins (1992) report][1912/3055],
200 chapter 2.6 "Aggregate Bandwidth".
190 201
191 202 ```json
192 203 {
 
... ... A single output JSON file looks like this:
220 231 }
221 232 ```
222 233
223 The meaning of each value is documented in the [acoustat manual][seewave.acoustat].
234 The meaning of each value is also documented in the [acoustat manual][seewave.acoustat].
224 235
225 236 #### Indexing
226 237
 
... ... jq --raw-output --compact-output -f acoustat.jq *.acoustat.json | curl -s -H "Co
264 275
265 276 #### Results and Discussion
266 277
267 With small changes on the existing Web UI the acoustic features are now available as search filters, but how can they be used during research?
278 With small changes to the existing Web UI the acoustic features are available as search filters, but how can they be used during research?
268 279
269 280 The [1992 Fristrup and Watkins report][1912/3055] outlines the design of the features.
270 281
 
... ... The [1992 Fristrup and Watkins report][1912/3055] outlines the design of the fea
272 283
273 284 The paper further explains a correlation test with a subset of 200 sound clips, to see if species could be distinguished using the statistical features.
274 285
275 It further notes:
286 It notes:
276 287
277 288 > The short-term bandwidth statistics in Table 5, the aggregate bandwidth statistics in
278 289 > Table 6, and the center frequency statistics of Table 7 were the most diagnostic for this set
279 290 > of sound sequences. They apparently separated the sounds of different species.
280 291
292 Due to the mentioned differences in parameters and workflow (and probably sample size), the values from `Table 6` don't translate 1:1 to the current results.
293 A correlation test over the current full result set might highlight the exact values that distinguish between species.
294
295 For now, the technical workflow for processing all sound clips effectively is established.
296 More refined parameters and methods are to be explored in the future.
281 297
282 298 #### References
283 299
Clone this repository using HTTP(S):
git clone https://rocketgit.com/user/dleucas/wmmsdb

Clone this repository using ssh (do not forget to upload a key first):
git clone ssh://rocketgit@ssh.rocketgit.com/user/dleucas/wmmsdb

Clone this repository using git:
git clone git://git.rocketgit.com/user/dleucas/wmmsdb

You are allowed to anonymously push to this repository.
This means that your pushed commits will automatically be transformed into a merge request:
... clone the repository ...
... make some changes and some commits ...
git push origin main