---
id: audio_detectors
title: Audio Detectors
---

Frigate provides a built-in audio detector that runs on the CPU. Compared to object detection in images, audio detection is a relatively lightweight operation, so running the detection on a CPU is the only option.

## Configuration

Audio events work by detecting a configured type of audio and creating an event; the event ends once that type of audio has not been heard for the configured amount of time. Audio events save a snapshot at the beginning of the event as well as recordings throughout the event. Recordings are retained according to the configured recording retention.

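As a minimal sketch of tuning that timeout, assuming your Frigate version exposes the `max_not_heard` option from the reference config (in seconds; check the full config reference for your version):

```yaml
audio:
  enabled: True
  # Assumed option name from the reference config: end an audio event
  # after this many seconds without hearing the audio type again.
  max_not_heard: 30
```
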
### Enabling Audio Events

Audio events can be enabled for all cameras or only for specific cameras.

```yaml
audio: # <- enable audio events for all cameras
  enabled: True

cameras:
  front_camera:
    ffmpeg:
      ...
    audio:
      enabled: True # <- enable audio events for the front_camera
```

If you are using multiple streams, you must set the `audio` role on the stream that will be used for audio detection. This can be any stream, but the stream must include audio.

:::note

The ffmpeg process for capturing audio will make a separate connection to the camera, in addition to the connections for the camera's other roles. For this reason, it is recommended to use the go2rtc restream for this purpose. See [the restream docs](/configuration/restream.md) for more information.

:::

```yaml
cameras:
  front_camera:
    ffmpeg:
      inputs:
        - path: rtsp://.../main_stream
          roles:
            - record
        - path: rtsp://.../sub_stream # <- this stream must have audio enabled
          roles:
            - audio
            - detect
```

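A minimal sketch of the go2rtc approach recommended in the note above (the stream name, camera URL, and the `preset-rtsp-restream` input preset here are illustrative; see [the restream docs](/configuration/restream.md) for the authoritative setup):

```yaml
go2rtc:
  streams:
    front_camera:
      # Illustrative camera URL; the source stream must include audio.
      - rtsp://user:password@camera_ip/main_stream

cameras:
  front_camera:
    ffmpeg:
      inputs:
        # Connect to the local go2rtc restream rather than the camera directly.
        - path: rtsp://127.0.0.1:8554/front_camera
          input_args: preset-rtsp-restream
          roles:
            - record
            - audio
```
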
### Configuring Audio Events

The included audio model can detect over 500 different types of audio, many of which are not practical to listen for. By default `bark`, `speech`, `yell`, and `scream` are enabled, but the list can be customized.

```yaml
audio:
  enabled: True
  listen:
    - bark
    - scream
    - speech
    - yell
```