File webroot/changelog.md changed (mode: 100644) (index b39b97a..e39f209) |
% Changelog
%
% Last Update: June 6, 2019

## Protocol of the project's development history

...
To create the GeoJSON file run the following command in the [source code][src] t

Example Queries (TODO)

### 24. July 2018: Analyzing Watkins sound clips for acoustic features

Enriching the Watkins sound database is one opportunity to be explored with this project.
The contents of each sound clip are well described by the database, but there is nothing providing insight into the actual signal characteristics.
Even simple properties like clip duration are not available.

Automatically analyzing ~15,000 sound clips might not have been an option with
affordable PC hardware in the 1990s, but any current-day machine can handle this task in reasonable time.

#### Method

...
on a software tool "Characterizing acoustic features of marine animal sounds".

This seems like a perfect fit to gain various statistical properties from the signal's time and frequency domains.

An implementation of some (but not all) of the acoustic feature functions exists in the popular [seewave][seewave.acoustat] library,
written in the [statistical computing language R][R] by Jerome Sueur.

The [manual][seewave.acoustat] states:

> acoustat was originally developed in Matlab language by Fristrup and Watkins (1992). The R function was kindly checked by Kurt Fristrup.

Other methods are to be explored in the future.

#### Implementation

Downloading all sound clips is left as an exercise for the reader. Please be reasonable and don't overload the WHOI server.

The remaining job is fairly simple: load the signal, run the statistics, and store the results as JSON files for further indexing in ElasticSearch.

Running the analysis *effectively* requires a task management tool. It keeps track of the progress, can resume an aborted run, and allows parallel execution of tasks.
A bash script can run the task in parallel, but [GNU Make][make] provides a clear _state_: how far the processing of all sound clips has progressed.
It does that by keeping track of input and output files. If an output JSON file does not exist, the job is not done.

This simplified `Makefile` defines `*.wav` files as `INPUTS` and `*.acoustat.json` as `ACOUSTAT` outputs, using `acoustat.json.r` as the job processor.
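
The full `Makefile` is truncated from this excerpt; a minimal sketch matching the description (the pattern rule layout is assumed, with the argument order taken from the R script below and the file stem passed as the record id) could look like:

```makefile
# all *.wav files are inputs, each maps to a *.acoustat.json output
INPUTS   := $(wildcard *.wav)
ACOUSTAT := $(INPUTS:.wav=.acoustat.json)

all: $(ACOUSTAT)

# job processor: input wav, output json, file stem as record id
%.acoustat.json: %.wav
	./acoustat.json.r $< $@ $*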
|
...

```r
library("methods")
argv = commandArgs(trailingOnly = TRUE)
wav = tuneR::readWave(argv[1])
stat = seewave::acoustat(wave = wav, plot = FALSE)
# remove unwanted contour data
stat$freq.contour <- NULL
stat$time.contour <- NULL
# assign record number as id
stat$id <- argv[3]
write_json(stat, argv[2])
```

The first line allows execution as a shell script and ensures a clean R environment. To avoid cluttered output during execution, various library messages are silenced.

#### Execution

All sound files, the `Makefile` and the `acoustat.json.r` script are placed in the same directory, and the following command runs the analysis with 8 parallel processes.
The number should equal the number of available CPU cores.

```bash
make -j 8
```
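
Instead of hard-coding `8`, the core count can be queried at run time; on Linux, GNU coreutils provides `nproc` for this (an assumption about the host system):

```shell
# print the number of available CPU cores
nproc
```

The result can be passed straight to Make as `make -j "$(nproc)"`.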
|

A single output JSON file looks like this:

```json
{
  "time.P1": [
    0.1157
...
}
```

The meaning of each value is documented in the [acoustat manual][seewave.acoustat].

#### Indexing

As a last step the raw values are mapped under `.sound.freq` and `.sound.time` in the existing JSON document tree.
|

An `acoustat.jq` script transforms the JSON data for the ElasticSearch [bulk import API][elastic.bulk].
|

```jq
{
  update: {
    _index: "wmmsdb",
    _type: "record",
    _id: .id[0]
  }
},
{
  doc: {
    sound: {
      freq: {
        IPR: .["freq.IPR"][0],
        M: .["freq.M"][0],
        P1: .["freq.P1"][0],
        P2: .["freq.P2"][0]
      },
      time: {
        IPR: .["time.IPR"][0],
        M: .["time.M"][0],
        P1: .["time.P1"][0],
        P2: .["time.P2"][0]
      }
    }
  }
}
```
|

Finally, the data is added to ElasticSearch using the following command.
|

```bash
jq --raw-output --compact-output -f acoustat.jq *.acoustat.json | curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@-" | jq .took
```
|

#### Results and Discussion
|

With small changes to the existing Web UI the acoustic features are now available as search filters, but how can they be used during research?
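
For instance, recordings could be narrowed down by their aggregate frequency median. A hypothetical query body against the fields indexed above (the field name comes from the `jq` mapping; the threshold values are made up for illustration):

```json
{
  "query": {
    "range": {
      "sound.freq.M": {
        "gte": 1,
        "lte": 5
      }
    }
  }
}
```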
|

The [1992 Fristrup and Watkins report][1912/3055] outlines the design of the features.
|

> Each statistic was designed to emphasize particular parameters of animal sounds that we recognized as important for distinguishing species.
|

The paper further explains a correlation test with a subset of 200 sound clips, to see if species could be distinguished using the statistical features.

It further notes:

> The short-term bandwidth statistics in Table 5, the aggregate bandwidth statistics in
> Table 6, and the center frequency statistics of Table 7 were the most diagnostic for this set
> of sound sequences. They apparently separated the sounds of different species.

#### References
|
...
[1912/3055]: https://hdl.handle.net/1912/3055
[seewave.acoustat]: http://rug.mnhn.fr/seewave/HTML/MAN/acoustat.html
|
[R]: https://www.r-project.org/
[make]: https://www.gnu.org/software/make/
[elastic.bulk]: https://www.elastic.co/guide/en/elasticsearch/reference/1.7/docs-bulk.html

### 20. July 2018: First release