dleucas / wmmsdb (public) (License: GPLv3) (since 2018-07-08) (hash sha1)
A collection of scripts to download, transform and normalize the Watkins Marine Mammal Sound Database.

Credit:

“Watkins Marine Mammal Sound Database, Woods Hole Oceanographic Institution.”

http://cis.whoi.edu/science/B/whalesounds/index.cfm
List of commits:
Subject Hash Author Date (UTC)
extract and transform ship names 12eb822d4ae70fbd2f01b0f03a77eee5e0d9a2b7 dleucas 2021-12-19 14:40:54
extract and transform observation time 595eb911201f8697c9c5f1aa1c40ee502e8009c2 dleucas 2021-12-19 12:04:29
formating f82ed6a3f1c0dccfc50aaa2a0fa3e866f5741acf dleucas 2021-12-18 23:18:45
update readme for GeoJSON command e5c1a6f49ea17a8860cecffdcf280d1d2fdb3ac2 dleucas 2021-12-18 23:13:31
add species names to GeoJSON and World Map 86af1d9597c3cb3c58f71c777902148d2ce333f8 dleucas 2021-12-18 22:51:11
re-worked GeoJSON from transformed JSON. identical output. added license 8ec5e1678966c6b1c441d035a6833f55586325ec dleucas 2021-12-18 22:01:19
remove genus, add species common and scientific names, display names in record details e2025219ade78f9ac2e6dfd81c479d242b9db24b dleucas 2021-12-18 17:18:28
add common and scientific names to schema b4c7dcb4042b9d1cd3da24123f4506113a876f02 dleucas 2021-12-18 17:17:17
remove genus, add species and type to type_of namespace change 2a7757a6c3e6dbb784b41d7a4b29381bc9105803 dleucas 2021-12-18 17:16:04
converted all transforms to functions 9d37303b24abc225f11f9c3d3b622c5167ed43e3 dleucas 2021-12-18 16:00:04
more conversion to functions. match old output for now fc4a8157a6902f4571b54c6ab84174f005adbe0d dleucas 2021-12-18 10:26:23
WIP convert filters to functions 32badc3512dd9094d51ba2cc2ef8112eba2698bf dleucas 2021-12-16 18:33:21
convert html only once. extract species names as json. formating and lint. e755dc7f4fe2d7c9b97826a0f3f2cf5385e90ef9 dleucas 2021-12-16 13:35:31
download once. use wget only. get species names. test for commands. formating 572dbf1eaffe17c43a4a01dc9675737628c5a234 dleucas 2021-12-16 12:14:26
add filter by behavior type, sort by modified date c3f9f9f9d9501e714117af7fff573e7f3fa4052b dleucas 2019-06-14 03:51:19
rename type to type_of 4269dc257530a9a7fa21ff8708f4594a2f1a453d dleucas 2019-06-14 03:39:49
ElasticSearch setting for larger HTTP request e83e501f949473538096f984220934c0a51de0b4 dleucas 2019-06-14 03:25:22
rename type to type_of e1fcd27b05eabc8bce06751a9925200e4707168b dleucas 2019-06-14 02:34:15
add animal behavior transformation and documentation 7550db3bbd1c69c9369cf8dfe3a5d1195e761ae2 dleucas 2019-06-14 00:57:16
add lost modified date c4922a44cebebd63da6c23a2a71f97cdb47b4a68 dleucas 2019-06-12 22:46:35
Commit 12eb822d4ae70fbd2f01b0f03a77eee5e0d9a2b7 - extract and transform ship names
Author: dleucas
Author date (UTC): 2021-12-19 14:40
Committer name: dleucas
Committer date (UTC): 2021-12-19 14:40
Parent(s): 595eb911201f8697c9c5f1aa1c40ee502e8009c2
Signing key:
Tree: 126c9c9c15715a2dfecf868750004dcda18be005
File Lines added Lines deleted
transform.jq 40 0
webroot/data.md 1 1
File transform.jq changed (mode: 100755) (index c6e526d..0a89915)
@@ -87,6 +87,45 @@ def as_observation_time:
     }
     else null end;
 
+def as_ship:
+    # Ship/Cruise, aquarium, or other platform
+    #
+    # Database contains various ship names, sometimes species code and short codes.
+    # Difficult to normalize with regex alone, manually fix the rest.
+    #
+    # Example source data:
+    # Abel-J
+    # ABEL J
+    # ABEL-J
+    # Annandale
+    # Bear 265
+    # BEAR 265
+    # BE (captive in a 75' x 45' enclosure)
+    # Hydrophone dangled from end of breakwater
+    {
+        "SH": "Ship",
+        "SM": "Small Boat",
+        "AQ": "Aquarium",
+        "BE": "Beach",
+        "IC": "Ice",
+        "UN": "Unknown",
+    } as $platform |
+    # remove species code
+    sub("\\b[A-C][A-Z]\\d{1,2}[A-Z]$"; "") |
+    # clean spaces
+    trim | sub("\\s{2,}"; " ") |
+    # map short codes
+    if . | length == 2 then $platform[.]
+    # normalize special cases
+    elif .[:5] == "Bear " then . | ascii_upcase
+    elif . == "Abel-J" then "ABEL-J"
+    elif . == "ABEL J" then "ABEL-J"
+    elif . == "IDA Z" then "IDA-Z"
+    elif . == "IDA- Z" then "IDA-Z"
+    elif . == "Risk" then "RISK"
+    elif . == "" then null
+    else . end;
+
 def as_location_name:
     # Location Name
     # Remove species code and whitespace
@@ -426,6 +465,7 @@ def as_sound_sample_rate:
         name: .GB | split("|") | map(as_location_name),
         coordinates: .GC | split("|") | as_location_coordinates
     },
+    ship: .SH | as_ship,
     # object contains properties of the captured signal
     signal: {
         position: [
File webroot/data.md changed (mode: 100644) (index 29fb0a3..319432a)
@@ -66,7 +66,7 @@ Generally the database is very densely encoded with information due to the DOS-A
 | GB | Geo B | 13354 | | 362 | | .location.name[] |
 | GC | Geo C | 13910 | | 224 | Yes | .location.coordinates[] |
 | OT | Observation Time | 7141 | | 450 | Yes | .observation_time[] |
-| SH | Ship | 13675 | | 62 | No | |
+| SH | Ship | 13675 | | 62 | No | .ship |
 | AU | Author | 14204 | | 58 | Yes | |
 | LO | Storage location of original recording | | 16 | 134 | Yes | |
 | HY | Hydrophone depth in m | 8075 | | 51 | Yes | |
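
Sketch of the ship normalization:
The as_ship transform added in this commit strips a trailing species code, cleans whitespace, maps two-letter platform codes, and then fixes a handful of known spelling variants. The following is a minimal Python re-expression of those steps, not code from the repository. The names normalize_ship, PLATFORM_CODES, SPECIAL_CASES and SPECIES_CODE are hypothetical, and the whitespace step assumes that the jq trim call strips leading and trailing whitespace.

```python
import re
from typing import Optional

# Two-letter platform codes, copied from the $platform object in the diff.
PLATFORM_CODES = {
    "SH": "Ship",
    "SM": "Small Boat",
    "AQ": "Aquarium",
    "BE": "Beach",
    "IC": "Ice",
    "UN": "Unknown",
}

# Manual fixes for spelling variants, copied from the elif chain in the diff.
SPECIAL_CASES = {
    "Abel-J": "ABEL-J",
    "ABEL J": "ABEL-J",
    "IDA Z": "IDA-Z",
    "IDA- Z": "IDA-Z",
    "Risk": "RISK",
}

# Same pattern as the jq sub() call that removes a trailing species code.
SPECIES_CODE = re.compile(r"\b[A-C][A-Z]\d{1,2}[A-Z]$")


def normalize_ship(raw: str) -> Optional[str]:
    """Normalize one raw SH field value (hypothetical helper)."""
    # Remove a trailing species code, if present.
    value = SPECIES_CODE.sub("", raw, count=1)
    # Strip outer whitespace (assumed behavior of trim), then collapse the
    # first run of two or more whitespace characters; jq sub() also replaces
    # only the first match.
    value = re.sub(r"\s{2,}", " ", value.strip(), count=1)
    # A two-character value is treated as a platform code; unknown codes
    # yield None, matching a missing key lookup in jq.
    if len(value) == 2:
        return PLATFORM_CODES.get(value)
    # "Bear 265" and "BEAR 265" should end up identical.
    # Note: upper() also upcases non-ASCII letters, unlike jq ascii_upcase.
    if value.startswith("Bear "):
        return value.upper()
    if value in SPECIAL_CASES:
        return SPECIAL_CASES[value]
    # Empty strings become None; everything else passes through unchanged.
    return value or None


print(normalize_ship("BE"))        # Beach
print(normalize_ship("Bear 265"))  # BEAR 265
print(normalize_ship("ABEL J"))    # ABEL-J
```

Keeping the manual fixes in a lookup table rather than a chain of comparisons is a design choice of this sketch only; it makes it easier to add further variants as they are found in the source data.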
Hints:
Before your first commit, do not forget to set up your git environment:
git config --global user.name "your_name_here"
git config --global user.email "your@email_here"

Clone this repository using HTTP(S):
git clone https://rocketgit.com/user/dleucas/wmmsdb

Clone this repository using SSH (do not forget to upload a key first):
git clone ssh://rocketgit@ssh.rocketgit.com/user/dleucas/wmmsdb

Clone this repository using the git protocol:
git clone git://git.rocketgit.com/user/dleucas/wmmsdb

You are allowed to anonymously push to this repository.
This means that your pushed commits will automatically be transformed into a merge request:
... clone the repository ...
... make some changes and some commits ...
git push origin main