dleucas / wmmsdb (public) (License: GPLv3) (since 2018-07-08) (hash sha1)

A collection of scripts to download, transform and normalize the Watkins Marine Mammal Sound Database.

Credit:

“Watkins Marine Mammal Sound Database, Woods Hole Oceanographic Institution.”

http://cis.whoi.edu/science/B/whalesounds/index.cfm

Clone URLs: https://rocketgit.com/user/dleucas/wmmsdb ssh://rocketgit@ssh.rocketgit.com/user/dleucas/wmmsdb git://git.rocketgit.com/user/dleucas/wmmsdb

Branches: master, species_names

List of commits:

Subject	Hash	Author	Date (UTC)
WIP document acoustat	4591875fd32c1c91d20133ff90dcf5676b3c216c	dleucas	2019-06-07 00:08:18
WIP document acoustat	ccc4a6de663a7272ee3d5777fe1479af549e9938	dleucas	2019-06-06 01:39:08
WIP document acoustat	1c6b03267e3016d9b637775df2b4b153866ac040	dleucas	2019-06-05 22:39:39
add dependency on nav.html and pandoc.css	b4f054eb6675117d576fa9462220bc5bc8d15be4	dleucas	2019-06-01 01:12:44
nav title	7c5fadd2e143028d614fab4c31ed7389ed17e6f6	dleucas	2019-06-01 01:12:04
document world map	07b70f4b85731456b559edeecea12b339e724aaf	dleucas	2019-06-01 01:11:43
use live ElasticSearch URL in example query	b6df3b91ba395f62003fa89f1d6ae3f6a705ea9e	dleucas	2019-05-31 23:38:40
Document geo coordinates based on the .GC field	d3c9fc90773c6252209074583a00b290633f340c	dleucas	2019-05-31 23:24:02
transform geo coordinates	b32a76ac0917bc3c8bb85e225b19004bc56ac929	dleucas	2019-05-31 22:26:16
update index mapping with geo_point for location coordinates	510a37c03a86603425c6af347c038d69d7fa0cde	dleucas	2019-05-31 22:24:42
WIP search by geo distance	61e13ff42bf7e9d283c7d18087c14bb30c2e69c8	dleucas	2019-05-31 22:16:46
about page: show task progress	9372732e21c753f99d706a474cb3a74394ced1dc	dleucas	2019-05-24 01:04:07
data page: describe mapping, clean-up table	a78264311f89a3b67d40fa8ba3d768c3ec9b2592	dleucas	2019-05-24 00:33:21
add site navigation, remove clutter	9e0361707284cc01d6795195df6848c1f20e88cc	dleucas	2019-05-23 01:21:22
WIP initial changelog	12150e01cbb190f83d9e5f02f1431a4856c780d2	dleucas	2019-05-23 01:20:30
add source code links to about page	2fbd58dd9ee99c0568c0c8005bb7becafa8c2163	dleucas	2019-05-23 01:19:28
add download links	7cc06b24732d4167bf8d357c68098f19a66fd01d	dleucas	2019-05-23 00:46:36
WIP notes on database fields	2cafc0525643f5dc0ed0a39bb1f914df4fae4f94	dleucas	2019-05-23 00:24:35
WIP notes on database fields	30cc261f4b22da6f6304b33bb57e65b2d145e931	dleucas	2019-05-22 23:43:27
more space for tables	332cdefc80e7d01046dbfe4198c16af92e85f1c4	dleucas	2019-05-22 23:42:50

Commit 4591875fd32c1c91d20133ff90dcf5676b3c216c - WIP document acoustat
Author: dleucas
Author date (UTC): 2019-06-07 00:08
Committer name: dleucas
Committer date (UTC): 2019-06-07 00:08
Parent(s): ccc4a6de663a7272ee3d5777fe1479af549e9938
Signing key:
Tree: 22bdbe14e3dc0f44020ef3d413d5a38272db097c

File	Lines added	Lines deleted
webroot/changelog.md	25	9

File webroot/changelog.md changed (mode: 100644) (index e39f209..12dbb9b)
...	...	on a software tool "Characterizing acoustic features of marine animal sounds".
113	113	> Aggregate Bandwidth, Intensity, Duration, Amplitude Modulation, Frequency Modulation,	> Aggregate Bandwidth, Intensity, Duration, Amplitude Modulation, Frequency Modulation,
114	114	> Short-term Bandwidth, Center Frequency, and Amplitude Frequency Interaction.	> Short-term Bandwidth, Center Frequency, and Amplitude Frequency Interaction.
115	115
116		This seems like a perfect fit, to gain various statistical properties from the signals time and frequency domain.
	116		The paper further asks:
117	117
118		An implementation, of some (but not all) of the acoustic features functions, exists in the popular [seewave][seewave.acoustat] library,
119		written in the [statistical computing language R][R] by Jerome Sueur.
	118		> Do the differences in these sound features remain distinctive as the scope of comparison widens?
	119		> With our own ears, we can often distinguish acoustic features that appear to be species-specific,
	120		> and sometimes features unique to individual animals;
	121		> can we specify numerical algorithms that objectively recognize these distinctions?
	122
	123		The mentioned "Aggregated Bandwidth" or "aggregate power spectrum" function, exists in the popular [seewave][seewave.acoustat] library,
	124		written in the [statistical computing language R][R] by Jerôme Sueur.
120	125
121	126	The [manual][seewave.acoustat] states:	The [manual][seewave.acoustat] states:
122	127
123	128	> acoustat was originally developed in Matlab language by Fristrup and Watkins (1992). The R function was kindly checked by Kurt Fristrup.	> acoustat was originally developed in Matlab language by Fristrup and Watkins (1992). The R function was kindly checked by Kurt Fristrup.
124	129
125		Other methods are to be explored in the future.
	130		There are a few things to note, on the default `acoustat()` implementation parameters, compared to the workflow described in the paper.
	131
	132		- There is no noise filter
	133		- The fraction defaults to 10%/90% instead of the papers 25%/75%
	134
	135		For now the `acoustat()` parameters are left to the defaults.
126	136
127	137	#### Implementation	#### Implementation
128	138

...	...	Running the analysis effectively* requires a task management tool. It keeps tra*
134	144	A bash script can run the task in parallel but [GNU Make][make] provides a clear _state_; how far the processing of all sound clips has progressed.	A bash script can run the task in parallel but [GNU Make][make] provides a clear _state_; how far the processing of all sound clips has progressed.
135	145	It does that by keeping track of input and output files. If an output JSON file does not exists, the job is not done.	It does that by keeping track of input and output files. If an output JSON file does not exists, the job is not done.
136	146
137		This simplified `Makefile` defines `.wav` files as INPUTS and `.acoustat.json` as `ACOUSTAT` outputs using `acoustat.json.r` as job processor.
	147		This simplified `Makefile` defines `.wav` files as INPUTS and `.acoustat.json` as `ACOUSTAT` outputs using `acoustat.json.r` as job processor.
138	148
139	149	```Makefile	```Makefile
140	150	DIR = $(abspath .)	DIR = $(abspath .)

...	...	The number should equal the number of available CPU cores.
186	196	```bash	```bash
187	197	make -j 8	make -j 8
188	198	```	```
189		A single output JSON file looks like this:
	199		A single output JSON file looks like this, with P1 and P2 being the lower and upper estimates mentioned in [Fristrup and Watkins (1992 ) report][1912/3055],
	200		chapter 2.6 "Aggregate Bandwidth".
190	201
191	202	```json	```json
192	203	{	{

...	...	A single output JSON file looks like this:
220	231	}	}
221	232	```	```
222	233
223		The meaning of each value is documented in the [acoustat manual][seewave.acoustat].
	234		The meaning of each value is also documented in the [acoustat manual][seewave.acoustat].
224	235
225	236	#### Indexing	#### Indexing
226	237

...	...	jq --raw-output --compact-output -f acoustat.jq .acoustat.json \| curl -s -H "Co*
264	275
265	276	#### Results and Discussion	#### Results and Discussion
266	277
267		With small changes on the existing Web UI the acoustic features are now available as search filters, but how can they be used during research?
	278		With small changes to the existing Web UI the acoustic features are available as search filters, but how can they be used during research?
268	279
269	280	The [1992 Fristrup and Watkins report][1912/3055] outlines the design of the features.	The [1992 Fristrup and Watkins report][1912/3055] outlines the design of the features.
270	281

...	...	The [1992 Fristrup and Watkins report][1912/3055] outlines the design of the fea
272	283
273	284	The paper further explains a correlation test with a subset of 200 sounds clips, to see if species could be distinguished using the statistical features.	The paper further explains a correlation test with a subset of 200 sounds clips, to see if species could be distinguished using the statistical features.
274	285
275		It further notes:
	286		It notes:
276	287
277	288	> The short-term bandwidth statistics in Table 5, the aggregate bandwidth statistics in	> The short-term bandwidth statistics in Table 5, the aggregate bandwidth statistics in
278	289	> Table 6, and the center frequency statistics of Table 7 were the most diagnostic for this set	> Table 6, and the center frequency statistics of Table 7 were the most diagnostic for this set
279	290	> of sound sequences. They apparently separated the sounds of different species.	> of sound sequences. They apparently separated the sounds of different species.
280	291
	292		Due to the mentioned differences in parameters and workflow (and probably sample size), the values from `Table 6` don't translate 1:1 on the current results.
	293		A correlation test over the current full result set, might highlight the exact values to distinguish between species.
	294
	295		For now the technical workflow, of processing all sound clips effectively, is established.
	296		More refined parameters and methods are to be explored in the future.
281	297
282	298	#### References	#### References
283	299
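
The changelog text in the diff above notes that the `acoustat()` defaults use a 10%/90% fraction where the paper uses 25%/75%. The following is a minimal Python sketch of the general idea behind such percentile bounds on a cumulative energy distribution, to show how the choice of fraction moves P1 and P2. It is not the seewave implementation: the function name `energy_bounds`, the toy spectrum and the frequency bins are assumptions made purely for illustration.

```python
from bisect import bisect_left
from itertools import accumulate


def energy_bounds(spectrum, freqs, lower, upper):
    """Return the (P1, P2) frequencies at which the normalized cumulative
    energy first reaches the `lower` and `upper` fractions (0..1)."""
    total = sum(spectrum)
    # Normalized cumulative distribution of energy over the frequency bins.
    cumulative = [c / total for c in accumulate(spectrum)]
    p1 = freqs[bisect_left(cumulative, lower)]
    p2 = freqs[bisect_left(cumulative, upper)]
    return p1, p2


# Hypothetical aggregate power spectrum over seven frequency bins (Hz).
spectrum = [1, 2, 4, 6, 4, 2, 1]
freqs = [100, 200, 300, 400, 500, 600, 700]

wide = energy_bounds(spectrum, freqs, 0.10, 0.90)    # fraction per the changelog's defaults
narrow = energy_bounds(spectrum, freqs, 0.25, 0.75)  # fraction per the paper
print(wide, narrow)
```

On this toy input the narrower 25%/75% fraction yields a tighter band than 10%/90%, which is one reason values computed with different fractions cannot be compared one to one.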
