Virtual Agents Transcript and Call Summary

Retrieves Virtual Agent transcripts and the Virtual Agent transfer summary when a call is transferred from a virtual agent to a human agent. For more information, please refer to this guide. Find the proto definition here.

Services

AiInsight

Service to subscribe to in order to get AI Insights

  • SERVER

    rpc StreamingInsightServing(StreamingInsightServingRequest) returns (stream StreamingInsightServingResponse)

    Server-side streaming gRPC call that takes a conversation ID and agent details as input and returns streaming insights for that conversation (a client sketch follows this list).

  • UNARY

    rpc InsightServing(InsightsServingRequest) returns (InsightsServingResponse)

    Unary gRPC call that takes a conversation ID and agent details as input and returns insights for that conversation.
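
Below is a minimal Python (grpcio) sketch of the server-streaming call. The endpoint host, the generated stub module names (ai_insight_pb2 / ai_insight_pb2_grpc), and the authorization metadata key are illustrative assumptions; only the service, rpc, and message names are taken from this reference.

    import grpc

    # Assumed module names, as generated from the AiInsight proto with grpcio-tools.
    import ai_insight_pb2 as pb
    import ai_insight_pb2_grpc as pb_grpc

    # Assumed endpoint; use the host published for your region.
    channel = grpc.secure_channel("insights.example.com:443", grpc.ssl_channel_credentials())
    stub = pb_grpc.AiInsightStub(channel)

    request = pb.StreamingInsightServingRequest(
        insightServingRequest=pb.InsightServingRequest(
            conversationId="<conversation-guid>",
            orgId="<control-hub-org-id>",
            historicalTranscripts=True,  # replay transcripts from the start of the conversation
            agentDetails=pb.AgentDetails(agentId="<agent-id>"),
        )
    )

    # The access token must be authorized for the org given in orgId.
    metadata = [("authorization", "Bearer <access-token>")]

    # Server-side streaming: one request in, a stream of insight messages out.
    for response in stub.StreamingInsightServing(request, metadata=metadata):
        insight = response.insightServingResponse
        print(insight.insightType, insight.isFinal, insight.utteranceId)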

Messages

AgentDetails

  • agentId
    string

AgentTransfer

Call Transferred to Human Agent

  • metadata
    google.protobuf.Struct

    Call Transfer Metadata.

CallInsightsResult

Call Insights Object for VA Call Summary

  • content
    string

    Content.

  • callInsightType
    CallInsightType

    Call Insight Type.

Duration

Represents the Duration object denoting seconds and nanos

  • seconds
    int64

    Signed seconds of the span of time. Must be from -315,576,000,000 to +315,576,000,000 inclusive. Note: these bounds are computed from: 60 sec/min * 60 min/hr * 24 hr/day * 365.25 days/year * 10000 years.

  • nanos
    int32

    Signed fractions of a second at nanosecond resolution of the span of time. Durations less than one second are represented with a 0 seconds field and a positive or negative nanos field. For durations of one second or more, a non-zero value for the nanos field must be of the same sign as the seconds field. Must be from -999,999,999 to +999,999,999 inclusive.
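
Word timings and result_end_time later in this reference are expressed with this Duration shape, so a small conversion helper can be handy. A minimal Python sketch, assuming a message object exposing seconds and nanos:

    def duration_to_seconds(d) -> float:
        # For durations under one second, seconds is 0 and nanos carries the sign;
        # otherwise nanos shares the sign of seconds, so plain addition is correct.
        return d.seconds + d.nanos / 1_000_000_000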

EndVirtualAgent

Represents the Virtual Agent End Indication

  • metadata
    google.protobuf.Struct

    Call Transfer Metadata.

ExitEvent

Event received from the Virtual Agent

  • event_type
    ExitEvent.EventType

    Event Type.

  • name
    string

    Optional: To be used for the custom event.

  • metadata
    google.protobuf.Struct

    Optional: map used to pass custom parameters.

InsightServingRequest

Represents the request content for retrieving AI Insights in the streaming gRPC call

  • conversationId
    string

    Required. Conversation ID for which insights are needed. The subscription will start listening to any insights for this conversation across multiple legs (IVR, Caller, Agent) and services (Transcription, Agent Assist).

  • orgId
    string

    Required. Control Hub Org ID for the org this conversation belongs to. The access token should have authorization for this org.

  • historicalTranscripts
    bool

    Whether historical transcripts from the start of the conversation are required. Default: false.

  • historicalVirtualAgent
    bool

    Whether virtual agent insights from the start of the conversation are required. Default: false.

  • agentDetails
    AgentDetails

    Required. AgentDetails from where the call is initiated.

  • messageId
    string

    Sets the message ID for the request; this uniquely identifies each request.

InsightServingResponse

Represents the content of the insight serving response used in the streaming gRPC call

  • orgId
    string

    Org Identifier (control hub) for which the insights need to be delivered.

  • conversationId
    string

    Identifier for the Conversation. Equivalent to Call ID, CallGUID etc.

  • roleId
    string

    Identifier for the individual leg, based on the party. GUID.

  • utteranceId
    string

    Identifier for a given utterance. The same utterance ID will be published for the transcript utterance and the insights generated from it.

  • role
    InsightServingResponse.Role

    Role specifying IVR, Caller or Agent.

  • insightType
    InsightServingResponse.ServiceType

    Type of insight: ASR, Agent Assist, etc.

  • insightProvider
    InsightServingResponse.ServiceProvider

    Service Provider who produced this insight.

  • publishTimestamp
    int64

    Epoch timestamp when this insight record was created/published. This field is always available and can be used for sorting messages by time.

  • startTimestamp
    int64

    Start time and end time correspond to the speech interval to which this insight belongs. Epoch timestamp. These are optional fields and not always available.

  • endTimestamp
    int64
  • isFinal
    bool

    Whether the insight is final or intermediate. Intermediate results will be overridden by the final result that follows them.

  • messageId
    string

    Message ID.

  • configId
    string
  • languageCode
    string
  • responseContent
    ResponseContent

    Content of the insight. This will vary based on the type of insight.
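
A sketch of how a consumer might process these responses is shown below; skipping intermediate results and treating the message-typed fields of responseContent as mutually exclusive are assumptions based on the per-service-type notes under ResponseContent.

    def handle_insight(insight) -> None:
        # Intermediate results are superseded by the final result that follows them.
        if not insight.isFinal:
            return

        content = insight.responseContent
        if content.HasField("recognitionResult"):        # Service Type = TRANSCRIPTION
            top = content.recognitionResult.alternatives[0]
            print("transcript:", top.transcript)
        elif content.HasField("virtualAgentResult"):     # Service Type = VIRTUAL_AGENT
            nlu = content.virtualAgentResult
            print("VA reply:", nlu.reply_text, "intent:", nlu.intent.display_name)
        elif content.HasField("callInsightsResult"):     # Service Type = CALL_INSIGHTS
            print("VA call summary:", content.callInsightsResult.content)
        elif content.rawContent:                         # placeholder for any other types
            print("raw:", content.rawContent)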

InsightsServingRequest

Represents the request content for the unary gRPC call retrieving AI Insights

  • conversationId
    string

    Required. Conversation ID (in combination with the messageId, if provided) for which insights are needed. The subscription will start listening to any insights for this conversation (along with messageId, if provided) across multiple legs (IVR, Caller, Agent) and services (Transcription, Agent Assist).

  • messageId
    string

    Optional. If messageId is provided, the insights are fetched using the combination of messageId and conversationId. The subscription will start listening to any insights for this messageId along with the conversationId field across multiple legs (IVR, Caller, Agent) and services (Transcription, Agent Assist).

  • orgId
    string

    Required. Control Hub Org ID for the org this conversation belongs to. The access token should have authorization for this org.

  • insightType
    InsightsServingRequest.InsightType
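
A minimal Python sketch of the unary InsightServing call, reusing the pb module, stub, and metadata from the streaming sketch above (those names remain assumptions); the CALL_INSIGHTS value is listed under InsightsServingRequest.InsightType in the Enums section.

    # Fetch insights (for example the VA call summary) for a conversation.
    request = pb.InsightsServingRequest(
        conversationId="<conversation-guid>",
        orgId="<control-hub-org-id>",
        insightType=pb.InsightsServingRequest.InsightType.CALL_INSIGHTS,
    )

    response = stub.InsightServing(request, metadata=metadata)
    if response.responseContent.HasField("callInsightsResult"):
        print(response.responseContent.callInsightsResult.content)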

InsightsServingResponse

Represents the response content for the unary gRPC call retrieving AI Insights

  • conversationId
    string

    Required. Conversation ID (in combination with the messageId, if provided) for which insights are needed. The subscription will start listening to any insights for this conversation (along with messageId, if provided) across multiple legs (IVR, Caller, Agent) and services (Transcription, Agent Assist).

  • messageId
    string

    Optional. If messageId is provided, the insights are fetched using the combination of messageId and conversationId. The subscription will start listening to any insights for this messageId along with the conversationId field across multiple legs (IVR, Caller, Agent) and services (Transcription, Agent Assist).

  • orgId
    string

    Required. Control Hub Org ID for the org this conversation belongs to. The access token should have authorization for this org.

  • startTimestamp
    int64

    Start time and end time correspond to the speech interval to which this insight belongs. Epoch timestamp. These are optional fields and not always available.

  • endTimestamp
    int64
  • configId
    string
  • languageCode
    string
  • insightProvider
    InsightsServingResponse.ServiceProvider

    Service Provider who produced this insight.

  • responseContent
    ResponseContent

    Content of the insight. This will vary based on the type of insight.

Intent

Represents the Intent detected from the user utterance

  • name
    string

    Name of the Intent.

  • display_name
    string

    Display name of the Intent.

  • parameters
    google.protobuf.Struct

    Parameters of an Intent, filled or not filled.

  • match_confidence
    float

    Match Confidence.

NLU

NLU Object generated from User Utterance.

  • reply_text
    string

    Response in text. This will be used for Virtual Agent Transcript.

  • intent
    Intent

    Intent detected from the last utterance.

  • agent_transfer
    AgentTransfer

    Sent when the call is transferred to an agent.

  • end_virtual_agent
    EndVirtualAgent

    Call Ended.

  • input_text
    string

    User input uttered by the caller.

  • exit_event
    ExitEvent

    Exit Event to return the control back to the calling flow.
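
The sketch below shows one way to interpret an NLU result when following a virtual agent conversation; treating agent_transfer, end_virtual_agent, and exit_event as mutually exclusive outcomes is an assumption for illustration.

    def classify_va_turn(nlu) -> str:
        # Outcome fields are message-typed, so presence can be tested with HasField.
        if nlu.HasField("agent_transfer"):
            return "call transferred to a human agent"
        if nlu.HasField("end_virtual_agent"):
            return "call ended by the virtual agent"
        if nlu.HasField("exit_event"):
            return f"exit event: {nlu.exit_event.name or nlu.exit_event.event_type}"
        return f"VA replied: {nlu.reply_text}"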

ResponseContent

Represents the response content message

  • rawContent
    string

    Placeholder for any other types. Not returned unless stated.

  • recognitionResult
    StreamingRecognitionResult

    For Service Type = TRANSCRIPTION.

  • virtualAgentResult
    NLU

    For Service Type = VIRTUAL_AGENT.

  • callInsightsResult
    CallInsightsResult

    For Service Type = CALL_INSIGHTS.

SpeechRecognitionAlternative

Represents the Alternative hypotheses (a.k.a. n-best list).

  • transcript
    string

    Output only. Transcript text representing the words that the user spoke.

  • confidence
    float

    Output only. The confidence estimate between 0.0 and 1.0. A higher number indicates an estimated greater likelihood that the recognized words are correct. This field is set only for the top alternative of a non-streaming result or of a streaming result where is_final=true. Not yet supported.

  • words
    WordInfo

    Output only. A list of word-specific information for each recognized word. Note: When enable_speaker_diarization is true, you will see all the words from the beginning of the audio.

StreamingInsightServingRequest

Represents the request for retrieving insights for a given conversation ID in the streaming gRPC call

  • insightServingRequest
    InsightServingRequest

StreamingInsightServingResponse

Response returned with Insights. There may be multiple messages in the stream. Each service type may have zero or more messages.

  • insightServingResponse
    InsightServingResponse

StreamingRecognitionResult

A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.

  • alternatives
    SpeechRecognitionAlternative

    Output only. May contain one or more recognition hypotheses (up to the maximum specified in max_alternatives). These alternatives are ordered in terms of accuracy, with the top (first) alternative being the most probable, as ranked by the recognizer.

  • is_final
    bool

    Output only. If false, this StreamingRecognitionResult represents an interim result that may change. If true, this is the final time the speech service will return this particular StreamingRecognitionResult; the recognizer will not return any further hypotheses for this portion of the transcript and corresponding audio.

  • result_end_time
    Duration

    Output only. Time offset of the end of this result relative to the beginning of the audio.

  • channel_tag
    int32

    For multi-channel audio, this is the channel number corresponding to the recognized result for the audio from that channel. For audio_channel_count = N, its output values can range from '1' to 'N'.

  • language_code
    string

    Output only. The BCP-47 language tag of the language in this result. This language code was detected to have the most likelihood of being spoken in the audio.

  • has_applied_recording_offsets
    bool

    Whether or not recording offsets have been applied to the word alignment values. Otherwise the word alignment start and end times are only relative within the utterance.

  • speaker_ids
    uint32

    Zero or more integers representing the speaker ID of this result. This is usually derived from the speaker integers that are passed in the streaming request.

  • last_packet_metrics_unix_timestamp_ms
    int64

    The unix time in milliseconds which was received from the client for the StreamingRecognizeRequest that was last used to complete this utterance.

  • message_type
    string

    Message type.

  • response_event
    StreamingRecognitionResult.OutputEvent

    Event based on user utterances.

  • role
    StreamingRecognitionResult.Role

WordInfo

Represents the Word-specific information for recognized words.

  • start_time
    Duration

    Output only. Time offset relative to the beginning of the audio, and corresponding to the start of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.

  • end_time
    Duration

    Output only. Time offset relative to the beginning of the audio, and corresponding to the end of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.

  • word
    string

    Output only. The word corresponding to this set of information.
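
A short sketch that pulls per-word timing out of the top hypothesis, reusing duration_to_seconds from the Duration section above; it assumes word time offsets were enabled so start_time and end_time are populated (they are set only in the top hypothesis).

    def print_word_timings(result) -> None:
        # Only final results are stable; interim results may still change.
        if not result.is_final:
            return
        top = result.alternatives[0]  # alternatives are ordered most probable first
        for info in top.words:
            start = duration_to_seconds(info.start_time)
            end = duration_to_seconds(info.end_time)
            print(f"{info.word}: {start:.2f}s - {end:.2f}s")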

Enums

CallInsightType

List of Call Insight Types

  • UNSPECIFIED
    0
  • VA_CALL_SUMMARY
    1

ExitEvent.EventType

  • UNSPECIFIED
    0
  • VA_CALL_END
    1
  • AGENT_TRANSFER
    2
  • CUSTOM
    3

InsightServingResponse.Role

Identifier for the party.

  • IVR
    0
  • CALLER
    1

InsightServingResponse.ServiceProvider

Provider List for Services

  • DEFAULT
    0
  • CISCO
    1
  • GOOGLE
    2
  • NUANCE
    3

InsightServingResponse.ServiceType

Type of service this Insight belongs to

  • DEFAULT_TRANSCRIPTION
    0
  • CALL_INSIGHTS
    5

InsightsServingRequest.InsightType

Type of service this Insight request belongs to

  • DEFAULT_TRANSCRIPTION
    0
  • CALL_INSIGHTS
    5

InsightsServingResponse.ServiceProvider

Provider List for Services

  • DEFAULT
    0
  • CISCO
    1
  • GOOGLE
    2
  • NUANCE
    3

StreamingRecognitionResult.OutputEvent

Returns the event based on the user input. Similar events will be returned for voice- and DTMF-based inputs.

  • EVENT_UNSPECIFIED
    0
  • EVENT_START_OF_INPUT
    1

    Triggered when the user utters the first utterance in voice input mode or the first DTMF is pressed in DTMF input mode. This event is used to barge in on the prompt, based on the prompt's barge-in flag. The event will be sent only if the current prompt being played is barge-in enabled or prompt playback is complete.

  • EVENT_END_OF_INPUT
    2

    Sent when the user's voice or DTMF utterance is complete.

  • EVENT_NO_MATCH
    3

    Sent when the utterance did not match any of the accepted inputs.

  • EVENT_NO_INPUT
    4

    Sent when no audio is received within the expected timeframe.

StreamingRecognitionResult.Role

  • UNDEFINED
    0

    Role - Undefined.

  • CALLER
    1

    Role - Caller.
