meta-offline-voice-agent/README.md

# Offline Speech Recognition and Command Execution
The `meta-offline-voice-agent` layer is an AGL layer that adds offline speech recognition and command execution capabilities to Automotive Grade Linux (AGL).

## Table of Contents
- [Introduction](#introduction)
- [Layer Status](#layer-status)
- [External Dependencies](#external-dependencies)
- [Working Features](#working-features)
- [Testing Features on AGL](#testing-features-on-agl)
    - [Build Layer](#build-layer)
    - [Test Vosk](#test-vosk)
    - [Test Snips](#test-snips)
- [Supported Targets](#supported-targets)
- [Maintainers](#maintainers)

## Introduction
The `meta-offline-voice-agent` layer integrates the Vosk API and Snips (inference only) to provide offline speech recognition and command execution for Automotive Grade Linux. Vosk is built on the Kaldi ASR toolkit and provides accurate, efficient speech recognition on the AGL platform, while Snips provides a lightweight natural language intent engine.

## Layer Status
**Status**: *WIP (Work In Progress)*

This layer is under development and integrates the Vosk, Snips (inference only), and RASA libraries. Speech recognition has been verified using test scripts from the [vosk-api python examples](https://github.com/alphacep/vosk-api/tree/master/python/example). Snips inference has also been tested and verified. Work on completing the RASA integration and on command execution is still in progress.

## External Dependencies
This layer has no external layer dependencies.

## Working Features
The following features are currently working in the `meta-offline-voice-agent` layer:
- [Vosk API (Python)](https://github.com/alphacep/vosk-api/tree/master/python)
- [Vosk Websocket Server](https://github.com/alphacep/vosk-server/tree/master/websocket)
- [Snips Inference](https://github.com/malik727/snips-inference-agl)

## Testing Features on AGL

### Build Layer
To test the features of this layer, you first need to build it into your final AGL image. Since this layer has no external layer dependencies (see [External Dependencies](#external-dependencies)), you can use the following commands to initialize and build it into the `agl-demo-platform` image for the `qemux86-64` machine:
```shell
$ source master/meta-agl/scripts/aglsetup.sh -m qemux86-64 -b build-master agl-demo agl-devel agl-offline-voice-agent
$ source agl-init-build-env
$ bitbake agl-demo-platform
```

Depending on the compute power of your machine, the build can take anywhere from 6 to 24 hours, or even longer. After the build completes, you can boot into your AGL image with the following command (QEMU must be installed for it to work):
```shell
$ runqemu tmp/deploy/images/qemux86-64/agl-demo-platform-qemux86-64.qemuboot.conf kvm serialstdio slirp publicvnc audio
```

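Before booting, it can help to confirm that the host is ready for the command above. The following is a minimal sketch, not part of this layer: the helper names `have_cmd`, `kvm_usable`, and `preflight` are illustrative, and the binary name `qemu-system-x86_64` is an assumption about how QEMU is installed on your host. The `kvm` option requires read and write access to `/dev/kvm`.

```shell
#!/bin/sh
# Pre-flight checks before calling runqemu (helper names are illustrative).

# Succeeds if the given command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the KVM device node (default: /dev/kvm) is readable and writable.
kvm_usable() {
    dev="${1:-/dev/kvm}"
    [ -r "$dev" ] && [ -w "$dev" ]
}

# Prints a warning for each check that fails; prints nothing if all pass.
preflight() {
    # Assumption: the host QEMU binary is named qemu-system-x86_64.
    have_cmd qemu-system-x86_64 || echo "warning: qemu-system-x86_64 not found on PATH"
    kvm_usable || echo "warning: /dev/kvm is not accessible; fix permissions or drop the 'kvm' option"
}
```

Source the snippet and call `preflight` from your build directory before running `runqemu`.
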
### Test Vosk
(**Not Recommended**) The simplest way to test the Vosk API is to run its `ptest` suite with the following command:
```shell
$ ptest-runner python3-vosk-api
```

For the above command to work, you need to enable `ptest` by adding the following lines to the `local.conf` fragment found at `meta-agl-devel/templates/feature/agl-offline-voice-agent/50_local.conf.inc`:
```shell
DISTRO_FEATURES:append = " ptest"
EXTRA_IMAGE_FEATURES += "ptest-pkgs"
```

The `ptest` method may be the easiest, but it is not recommended because enabling `ptest` substantially increases image build times. See the official [vosk-api docs](https://alphacephei.com/vosk/install) for usage and other ways of testing.

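If you do decide to enable `ptest`, the two configuration lines can be added without creating duplicates on repeated runs. This is a minimal sketch under stated assumptions: `enable_ptest` is an illustrative helper name, not something shipped by this layer, and it only appends a line when an identical line is not already present in the file.

```shell
#!/bin/sh
# Append the ptest settings to a BitBake conf file only if they are missing.
# enable_ptest is an illustrative helper, not part of the layer.
enable_ptest() {
    conf="$1"
    for line in 'DISTRO_FEATURES:append = " ptest"' \
                'EXTRA_IMAGE_FEATURES += "ptest-pkgs"'; do
        # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
        grep -qxF -e "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
    done
}
```

For example, `enable_ptest meta-agl-devel/templates/feature/agl-offline-voice-agent/50_local.conf.inc` would update the fragment named above; running it a second time leaves the file unchanged.
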
### Test Snips
To test the Snips NLU intent engine, you can use the sample [pre-trained model](https://github.com/malik727/snips-model-agl). By default, it is built into the target image automatically when you include this layer. To perform inference with this model, run the following command inside your target image:
```shell
$ snips-inference parse /usr/share/nlu/snips/model/ -q "your command here"
```

This is only a sample model and may not handle all types of commands. You can train your own intent engine model on a custom dataset; for details, see the README files of [snips-sdk-agl](https://github.com/malik727/snips-sdk-agl), [snips-model-agl](https://github.com/malik727/snips-model-agl), and [snips-inference-agl](https://github.com/malik727/snips-inference-agl).

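To try several commands against the sample model in one go, the `snips-inference parse` invocation shown above can be wrapped in a loop. This is a rough sketch: `parse_all` and the `MODEL_DIR` variable are illustrative names, and the utterances in the usage note are made-up examples that the sample model may or may not recognize.

```shell
#!/bin/sh
# Run several utterances through the Snips model, one at a time.
# MODEL_DIR defaults to the model path used in this README.
MODEL_DIR="${MODEL_DIR:-/usr/share/nlu/snips/model/}"

# parse_all is an illustrative helper: it prints each query, then the parser output.
parse_all() {
    for query in "$@"; do
        printf '== %s\n' "$query"
        snips-inference parse "$MODEL_DIR" -q "$query"
    done
}
```

Inside the target image, call it as `parse_all "turn on the radio" "set volume to five"` and compare the parsed intent printed for each utterance.
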
## Supported Targets
The following target is currently supported:
- QEMU x86-64 (work in progress)

## Maintainers
- Malik Talha <talhamalik727x@gmail.com>
- Aman Arora <aman.arora9848@gmail.com>
