dleucas / wmmsdb (public) (License: GPLv3) (since 2018-07-08) (hash sha1)
A collection of scripts to download, transform and normalize the Watkins Marine Mammal Sound Database.

Credit:

“Watkins Marine Mammal Sound Database, Woods Hole Oceanographic Institution.”

http://cis.whoi.edu/science/B/whalesounds/index.cfm
List of commits:
Subject Hash Author Date (UTC)
WIP notes on database fields 2cafc0525643f5dc0ed0a39bb1f914df4fae4f94 dleucas 2019-05-23 00:24:35
WIP notes on database fields 30cc261f4b22da6f6304b33bb57e65b2d145e931 dleucas 2019-05-22 23:43:27
more space for tables 332cdefc80e7d01046dbfe4198c16af92e85f1c4 dleucas 2019-05-22 23:42:50
spelling 48bfa46f40db11b1d120a113502d85916d166db9 dleucas 2019-05-22 23:42:22
ignore generated html files c303061b3e28314361be79b7eea01e38deb4cc52 dleucas 2019-05-22 22:39:25
add Makefile for pandoc based html5 generation 5537d1647e4b76b86bc54b9ce6383feb9618d226 dleucas 2019-05-22 22:37:48
fix spelling and use pandoc title d66f11d8c855ef270ebde81ff5a55473729633f2 dleucas 2019-05-22 22:30:08
Revert "fix spelling and use pandoc title feature" 4a5aca9355e0a2221591047e7b4b56b5e42414e6 dleucas 2019-05-22 22:23:50
fix spelling and use pandoc title feature dc3c3f6b98235fb9d7268ccd5b5c5fb62b52a93c dleucas 2019-05-22 21:54:03
WIP about page 2483c1505cd3379dc9609661e90a4718ea4f6aeb dleucas 2019-05-20 00:07:03
better site description 559a9ece7031d493cec54754aa8f125446e8ad37 dleucas 2019-05-17 15:46:45
add link to world map e865344ba45c24781889924747d0777bbd74128c dleucas 2019-05-17 15:41:51
add map using arcgis.js facbffd944ef65cdea19ec292633b75d1232cd2b dleucas 2019-05-17 14:52:40
add location and date 55d68f71ef5da4edf39b2aad1ac0e55c9236e3b8 dleucas 2019-05-17 14:47:50
document GeoJSON command 8a0c9437ad2c7920d48125ab16795e9d2eb4aae7 dleucas 2019-05-17 02:43:44
working coordinate conversion 93690c423d103a8c643eb736c49f5086c97569ab dleucas 2019-05-17 02:26:16
WIP GeoJSON transformation 0fb3c00fe05abdab66a505526412273340a57c02 dleucas 2018-08-04 03:16:55
add filter by location name 043ca941dd733b0c36274576ec17680255f75646 dleucas 2018-07-25 20:13:46
index settings for location name. also use english for notes 6ccc4e6332fb40b2556c2570e9c3cdbb5a48f8a3 dleucas 2018-07-25 20:13:19
add location names c67e4b4eea6c4ef16337e3eac6eb10ae16393430 dleucas 2018-07-25 18:32:45
Commit 2cafc0525643f5dc0ed0a39bb1f914df4fae4f94 - WIP notes on database fields
Author: dleucas
Author date (UTC): 2019-05-23 00:24
Committer name: dleucas
Committer date (UTC): 2019-05-23 00:24
Parent(s): 30cc261f4b22da6f6304b33bb57e65b2d145e931
Signer:
Signing key:
Signing status: N
Tree: b113960d94e29fd52d2b889773bab95cf6289a86
File Lines added Lines deleted
webroot/data.md 46 38
File webroot/data.md changed (mode: 100644) (index 929aa71..14d2ee1)
... ... Note that the database is not available in its native DOS-based format, as crea
10 10
11 11 [whoi.report]: https://whoicf2.whoi.edu/science/B/whalesounds/WHOI-92-31.pdf
12 12
13 ### Database fields
14
15 Total number of records 15254
16
17 | Field | Description | Missing | Unique | Multi-Valued | Note |
18 |-------+-------------+---------+--------+--------------|------|
19 | RN | Record Number | 0 | 15254 | No | Record Number: always present, all unique |
20 | CU | Cue | 89 | | No | Cue: 12631 contain "B" buffer size flag |
21 | NC | Number of Audio Channels | 8 | 40 | No | Some invalid formatting |
22 | SR | Sample Rate | 5 | 48 | No | Some with dot as delimiter and mixed khz hz writing |
23 | CS | Cut Size | 11 | 6131 | No | Mostly seconds (n.n+), some with minutes (n:n.n+) |
24 | PL | Recorder | 7 | 253 | No | |
25 | SC | signal class| 952 | 26 | No | quality not always present, flags in no order |
26 | ID | vocal animal id | 13338 | 18 | Yes | species code not always present |
27 | AG | age | 14769 | 13 | Yes | using ? as placeholder if age is unknown, species code might be name |
28 | IA | interaction | 15211 | 5 | Yes | multiple interactions with pipe separated, always in pairs |
29 | GS | genus | 0 | 307 | Yes | pipe separated, other species codes X / O / E |
30 | GA | geo A code | 20 | 194 | Yes | pipe separated |
31 | OD | observation date | 0 | 496 | Yes | pipe separated |
32 | NT | note | 4 | 5398 | No | free text |
33 | DA | record date | 30 | 437 | No | Month written 3 or 4 letters, some extra noise |
34 | IP | ID of con present | 15 records | 2 | Yes | pipe separated |
35 | AG | age of con present | 15 records | 2 | | |
36 | BH | behavior | 2442 records | 48 | No | some variation/free text, normalize? |
37 | OS | other species | 3995 records | 75 | Yes | pipe separated, not vocalizing species? |
38 | NA | number of animals vocalizing | 14889 records | 420 | Yes | ranges 1-2, or 1+, handle space, pipe separated, some noise |
39 | GB | Geo B | 13354 records | 362 |
40 | GC | Geo C | 13910 records | 224 | Yes | pipe separated |
41 | OT | observation time | 7141 records | ? | Yes | sometimes range nnnn - nnnn, pipe or ; separated |
42 | SH | ship | 13675 records | 62 |
43 | AU | author | 14204 records | 58 | Yes | pipe separated |
44 | LO | 16 |
45 | HY | 8075 |
46 | RC | 9524 |
47 | RG | 2208 |
48 | SL | 15253 |
49 | ST | 1648 |
50
13 ### Database Fields
14
15 The following table describes each database field (work in progress).
16
17 | Field | Description | Records | Missing | Unique | Multi-Valued | Note |
18 |------:+-------------+--------:+--------:+-------:+--------------+------|
19 | RN | Record Number | 15254 | 0 | 15254 | No | Always present, all unique |
20 | CU | Cue | | 89 | | No | 12631 contain "B" buffer size flag |
21 | NC | Number of Audio Channels | | 8 | 40 | No | Some invalid formatting |
22 | SR | Sample Rate | | 5 | 48 | No | Some with dot as delimiter and mixed kHz/Hz notation |
23 | CS | Cut Size | | 11 | 6131 | No | Mostly seconds (n.n+), some with minutes (n:n.n+) |
24 | PL | Recorder | | 7 | 253 | No | |
25 | SC | Signal Class| | 952 | 26 | No | quality not always present, flags in no order |
26 | ID | Vocal Animal ID | | 13338 | 18 | Yes | species code not always present |
27 | AG | Age | | 14769 | 13 | Yes | using ? as placeholder if age is unknown, species code might be name |
28 | IA | Interaction | | 15211 | 5 | Yes | multiple interactions are pipe separated, always in pairs |
29 | GS | Genus | | 0 | 307 | Yes | pipe separated, other species codes X / O / E |
30 | GA | Geo A Code | | 20 | 194 | Yes | pipe separated |
31 | OD | Observation Date | | 0 | 496 | Yes | pipe separated |
32 | NT | Note | | 4 | 5398 | No | Free Text |
33 | DA | Record Date | | 30 | 437 | No | Month written 3 or 4 letters, some extra noise |
34 | IP | ID of con present | 15 | | 2 | Yes | pipe separated |
35 | AG | Age of con present | 15 | | 2 | | |
36 | BH | Behavior | 2442 | | 48 | No | some variation/free text, normalize? |
37 | OS | Other Species | 3995 | | 75 | Yes | pipe separated, not vocalizing species? |
38 | NA | Number of Animals Vocalizing | 14889 | | 420 | Yes | ranges 1-2, or 1+, handle space, pipe separated, some noise |
39 | GB | Geo B | 13354 | | 362 | | |
40 | GC | Geo C | 13910 | | 224 | Yes | pipe separated |
41 | OT | Observation Time | 7141 | | | Yes | sometimes range nnnn - nnnn, pipe or ; separated |
42 | SH | Ship | 13675 | | 62 | No | |
43 | AU | Author | 14204 | | 58 | Yes | pipe separated |
44 | LO | | 16 | | | | |
45 | HY | | 8075 | | | | |
46 | RC | | 9524 | | | | |
47 | RG | | 2208 | | | | |
48 | SL | | 15253 | | | | |
49 | ST | | 1648 | | | | |
50
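The Missing, Unique and Multi-Valued columns above could be derived with a small helper like the following. This is only a sketch: the in-memory layout (one dict per record, keyed by field code), the helper name `field_stats`, and the sample values are assumptions for illustration, and "Unique" is taken to mean distinct raw strings, which may not match how the table was actually computed.

```python
from collections import Counter


def field_stats(records, field, sep="|"):
    """Return (missing, unique, multi_valued) for one field code.

    records: iterable of dicts mapping field codes (e.g. "RN", "GS") to
    raw string values. This layout is assumed, not taken from the repo.
    """
    missing = 0
    values = Counter()
    multi_valued = False
    for record in records:
        raw = (record.get(field) or "").strip()
        if not raw:
            # Field absent or empty: count the record as missing.
            missing += 1
            continue
        values[raw] += 1
        if sep in raw:
            # At least one record carries several pipe-separated values.
            multi_valued = True
    return missing, len(values), multi_valued


# Made-up sample records, not real database content.
records = [
    {"RN": "1", "GS": "AA1A"},
    {"RN": "2", "GS": "AA1A | BB2B"},
    {"RN": "3"},
]
print(field_stats(records, "GS"))
```

Counting distinct raw strings keeps the helper simple; counting distinct values after splitting on the separator would be an equally plausible reading of "Unique".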
51 ### Examples and Transformations
52
53 (TODO)
54
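As one possible transformation, the CS (Cut Size) note in the table says values are mostly seconds (n.n+) with some written as minutes (n:n.n+). A minimal sketch of normalizing both forms to seconds follows; the function name `cut_size_to_seconds` is hypothetical, and the exact raw formatting in the database may contain noise this does not handle.

```python
def cut_size_to_seconds(raw):
    """Convert a Cut Size string to seconds as a float.

    Handles plain seconds ("12.5") and minutes with seconds ("1:02.5").
    Returns None for empty input; raises ValueError for anything else.
    """
    text = raw.strip()
    if not text:
        return None
    if ":" in text:
        # Assumed form: minutes, a colon, then seconds with optional fraction.
        minutes, seconds = text.split(":", 1)
        return int(minutes) * 60 + float(seconds)
    return float(text)


print(cut_size_to_seconds("12.5"))
print(cut_size_to_seconds("1:02.5"))
```

Raising `ValueError` on unexpected input, rather than guessing, makes it easier to collect the malformed records for manual review.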
55 #### GS / Genus
56
57 (TODO)
58
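Since the table marks GS as multi-valued and pipe separated, a first transformation step might simply split the raw value into a list. The sketch below assumes that surrounding whitespace is insignificant and that empty segments can be dropped; the helper name `split_multi_valued` and the sample codes are made up for illustration.

```python
def split_multi_valued(raw, sep="|"):
    """Split a pipe-separated field into a list of trimmed, non-empty values."""
    return [part.strip() for part in raw.split(sep) if part.strip()]


# Hypothetical codes, not taken from the database.
print(split_multi_valued("AA1A | BB2B"))
```

The same helper would apply to the other fields noted as pipe separated (for example GA, OD, or AU), although fields such as OT that also use `;` as a separator would need extra handling.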
Hints:
Before your first commit, do not forget to set up your git environment:
git config --global user.name "your_name_here"
git config --global user.email "your@email_here"

Clone this repository using HTTP(S):
git clone https://rocketgit.com/user/dleucas/wmmsdb

Clone this repository using SSH (do not forget to upload a key first):
git clone ssh://rocketgit@ssh.rocketgit.com/user/dleucas/wmmsdb

Clone this repository using the git protocol:
git clone git://git.rocketgit.com/user/dleucas/wmmsdb

You are allowed to anonymously push to this repository.
This means that your pushed commits will automatically be transformed into a merge request:
... clone the repository ...
... make some changes and some commits ...
git push origin main