AI - MSPEAK Progress Document
Update: 22.02.2023 - Version 1.0 API Document | Benchmark result
Update: 10.03.2023 - Summary of version 1.0 - Slide
Update: 15.03.2023 - Planning version 2.0
Update: 25.05.2023 - Diagram v2.0 PDF
Update: 25.05.2023 - V2 evaluation plan - V2 user guide - Doc
Update: 25.05.2023 - Planning version 3.0 - Slide
Update: 11.10.2023 - Explain API V3 - Doc
Offline Plan Workflow
Update: 21.01.2023 - Offline flow M-Speak
Infra Mspeak
Reference: https://docs.kolena.io/metrics/wer-cer-mer/
Integration:
- API method
Property | Value |
---|---|
URL | https://app.monkeyenglish.net/mspeak/v3/score |
Method | POST |
Header | Bearer {{JWT Token}} |
Body | Form |
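
The table above could be turned into a request roughly as follows. This is only a sketch: the table does not say which header carries the token, how the form is encoded, or which form fields the endpoint expects, so the `Authorization` header name, the URL-encoded body, and the `form_fields` contents are all assumptions. If the endpoint expects an audio file upload, a multipart form would be needed instead.

```python
import urllib.parse
import urllib.request

SCORE_URL = "https://app.monkeyenglish.net/mspeak/v3/score"


def build_score_request(jwt_token: str, form_fields: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the v3 score endpoint.

    Assumptions: the token goes in an "Authorization" header and the form
    body is URL-encoded. The field names in form_fields are hypothetical.
    """
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    return urllib.request.Request(
        SCORE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {jwt_token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


if __name__ == "__main__":
    # "text" is a placeholder field name, not a documented parameter.
    req = build_score_request("my-jwt-token", {"text": "hello"})
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```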
- WebSocket method
This guide explains how to use a WebSocket client to stream audio data from a file to a WebSocket server and handle server responses.
Overview
This WebSocket client allows you to:
- Stream audio data from a file to a WebSocket server.
- Receive and process responses from the server.
- Send additional data to the server based on the responses.
Domain dev: "wss://ai.monkeyenglish.net/ws/v2/{device_id}"
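
As a minimal sketch of the streaming side, the helper below splits an audio file into fixed-size chunks that a WebSocket client could send one by one as binary frames. The chunk size of 4096 bytes and the function names are assumptions, not values taken from the server; the actual WebSocket library is left out on purpose.

```python
import io
from typing import BinaryIO, Iterator

# Dev domain from this document; {device_id} is filled in per device.
DEV_WS_URL_TEMPLATE = "wss://ai.monkeyenglish.net/ws/v2/{device_id}"


def dev_ws_url(device_id: str) -> str:
    """Return the dev WebSocket URL for one device."""
    return DEV_WS_URL_TEMPLATE.format(device_id=device_id)


def iter_audio_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive chunks of audio bytes until the stream is exhausted.

    chunk_size is an assumed value; tune it to what the server expects.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


if __name__ == "__main__":
    # Stand-in for open("recording.wav", "rb"): 10000 bytes of fake audio.
    fake_audio = io.BytesIO(b"\x00" * 10000)
    sizes = [len(chunk) for chunk in iter_audio_chunks(fake_audio)]
    print(dev_ws_url("device-123"), sizes)
```

Each yielded chunk would be passed to the client's binary send call, and server responses would be read between or after sends.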
Workflow app + server
Note:
- When the audio sent by the app is identified by the server as silence and has already been scored, but the score is still below 50 points, and the silence continues, the server does not score it again. The server only scores again once the silence ends and the user says something more.
- When the user actively stops the recording, or the recording time runs out, the normal flow still applies.
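
The re-scoring rule in the first note can be pictured as a small gate. This is a sketch of the described behaviour, not the server's code: the class and method names are hypothetical, and the note only covers scores below 50, so the sketch simply reports "no re-score" for any other case.

```python
from dataclasses import dataclass

LOW_SCORE_THRESHOLD = 50  # threshold taken from the note above


@dataclass
class RescoreGate:
    """Tracks whether new speech has arrived since the last score."""

    last_score: float
    spoke_since_score: bool = False

    def on_chunk(self, is_silent: bool) -> None:
        # Any non-silent chunk counts as "the user said something more".
        if not is_silent:
            self.spoke_since_score = True

    def record_score(self, score: float) -> None:
        # A fresh score resets the gate.
        self.last_score = score
        self.spoke_since_score = False

    def may_rescore(self) -> bool:
        # Low score plus continued silence: do not score again.
        # Scores of 50 or more are outside the note; assumed False here.
        return self.last_score < LOW_SCORE_THRESHOLD and self.spoke_since_score


if __name__ == "__main__":
    gate = RescoreGate(last_score=30)
    gate.on_chunk(is_silent=True)
    print(gate.may_rescore())  # still silent
    gate.on_chunk(is_silent=False)
    print(gate.may_rescore())  # user spoke again
```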
Event Tracking App
- Overview
This document describes how to write events to a client push stream using Amazon Kinesis. Each event carries timing metrics, identifiers, version information, error messages, and timestamps.
- Event Structure
Each event is represented by the following fields:
- time_first_push: The time from opening the microphone until the first bytes are sent.
- time_handshake: The time taken for the handshake process, represented as a float32.
- time_response: The time taken to receive a response, represented as a float32.
- profile_id: The unique identifier for the user profile, represented as a string.
- user_id: The user ID.
- mode: An enum, either [online | offline] or [0, 1].
- device_id: The unique identifier for the device, represented as a string.
- app_version: The version of the application, represented as a string.
- error: Any error encountered during the process, represented as a string.
- request_id: A unique identifier for the request, represented as a string.
- created_at: The timestamp when the event was created, with second-level precision.
- event_name: mspeak_websocket
- platform: The platform, represented as a string.
- score: The score for the session.
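
One way to keep the field list consistent on the client is a small typed record. The sketch below is illustrative only: the class name is hypothetical, and the types of `time_first_push`, `user_id`, `mode`, `platform` and `score`, as well as the use of Unix epoch seconds for `created_at`, are assumptions where the list above does not state a type.

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass
class MspeakWebsocketEvent:
    time_first_push: float  # type assumed
    time_handshake: float   # float32 in the field list
    time_response: float    # float32 in the field list
    profile_id: str
    user_id: str            # type assumed
    mode: str               # "online" or "offline"; 0/1 is the alternative form
    device_id: str
    app_version: str
    error: str
    request_id: str
    created_at: int         # assumed Unix epoch seconds (second-level precision)
    platform: str
    score: float            # type assumed
    event_name: str = "mspeak_websocket"

    def to_json(self) -> str:
        """Serialize the event to a JSON string."""
        return json.dumps(asdict(self))


if __name__ == "__main__":
    event = MspeakWebsocketEvent(
        time_first_push=0.12, time_handshake=0.05, time_response=0.80,
        profile_id="p-1", user_id="u-1", mode="online", device_id="d-1",
        app_version="1.0.0", error="", request_id="r-1",
        created_at=int(time.time()), platform="ios", score=75.0,
    )
    print(event.to_json())
```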
- Prerequisites
- Kinesis: data_stream= app_ai_log
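
A hedged sketch of pushing one event to the `app_ai_log` stream follows. It builds the arguments for the Kinesis `put_record` call (`StreamName`, `Data`, `PartitionKey`) and takes the client as a parameter, so it can be used with a boto3 Kinesis client or a stub. Using `device_id` as the partition key and JSON as the payload encoding are assumptions; this document does not specify either.

```python
import json

STREAM_NAME = "app_ai_log"  # data stream named in the prerequisites


def build_put_record_args(event: dict) -> dict:
    """Build keyword arguments for a Kinesis put_record call.

    Assumptions: JSON-encoded payload, device_id as the partition key.
    """
    return {
        "StreamName": STREAM_NAME,
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": str(event["device_id"]),
    }


def push_event(kinesis_client, event: dict):
    """Send one event using any client exposing put_record(**kwargs)."""
    return kinesis_client.put_record(**build_put_record_args(event))


if __name__ == "__main__":
    sample = {"event_name": "mspeak_websocket", "device_id": "d-1", "score": 75}
    print(build_put_record_args(sample))
    # With boto3 installed and credentials configured, something like:
    #   import boto3
    #   push_event(boto3.client("kinesis"), sample)
```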
Message Push Stream Process
Step 1: Connect to the WebSocket: ws/v3/device_id
When the connection succeeds, the client receives the message
{"type": 3, "msg": "Connect to server successfully"}
Step 2: Push the audio data as an array of bytes
Step 3: To request a score for the data recorded so far, send
{"type": 1, "data": {same data as in v2, placed inside "data"}}
Step 4: The result returned is the same as before
Step 5: Send a message to clear the data and prepare for a new session
{"type": 2}
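
The five steps can be sketched as a transport-independent session driver. The `send` and `recv` callables stand in for whatever WebSocket library the client uses, the function names are hypothetical, and the contents of the score payload and of the result are not specified here (they follow v2), so they are passed through untouched.

```python
import json
from typing import Callable, Iterable, Union

MSG_SCORE = 1      # Step 3: request a score for the audio sent so far
MSG_CLEAR = 2      # Step 5: clear data and prepare a new session
MSG_CONNECTED = 3  # Step 1: greeting sent by the server on connect


def score_message(payload: dict) -> str:
    """Wrap the v2-style payload inside "data", as Step 3 describes."""
    return json.dumps({"type": MSG_SCORE, "data": payload})


def clear_message() -> str:
    return json.dumps({"type": MSG_CLEAR})


def run_session(
    send: Callable[[Union[str, bytes]], None],
    recv: Callable[[], str],
    audio_chunks: Iterable[bytes],
    payload: dict,
) -> dict:
    """Run one scoring session over an already-opened connection."""
    greeting = json.loads(recv())            # Step 1
    if greeting.get("type") != MSG_CONNECTED:
        raise RuntimeError(f"unexpected greeting: {greeting}")
    for chunk in audio_chunks:               # Step 2
        send(chunk)
    send(score_message(payload))             # Step 3
    result = json.loads(recv())              # Step 4
    send(clear_message())                    # Step 5
    return result
```

Passing the transport in as plain callables keeps the message order easy to test with a stub before wiring in a real WebSocket client.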