
Human-Like, Real-Time AI Scheduler Powered by Amazon Nova Sonic
Summary
ScienceSoft's HIPAA-compliant AI voice agent automates healthcare scheduling with human-like conversations and the first-ever integration of the Amazon Nova Sonic speech-to-speech model with the LiveKit Media Server. The agent confirms appointments during the same call, without waiting on batch updates to backend systems, and aims at a 50% reduction in scheduling costs.
Where Existing AI Voice Agents Fall Short
Most AI voice agents for appointment scheduling that are currently on the market chain several models together. One model converts speech to text, another interprets the language and generates a response, and a third transforms the text back into speech. This multi-step setup frequently results in robotic-sounding responses and slow, fragmented conversations. Such AI assistants don't account for natural pauses when a patient hesitates and can't handle polite turn-taking. They also don't adapt their speech to variations in speaking style, prosody, and tone. As a result, patient interactions become awkward, impersonal, and prone to mistakes.
What's more, the limitations of the existing AI assistants can often complicate real-time data exchange with hospital systems, such as scheduling systems, EHRs, and practice management software (PMS). Consequently, some AI appointment scheduling assistants may rely on manual entry or batch processes to update scheduling details and can't access physician availability in real time during the call.
ScienceSoft's AI Scheduling Voice Agent to Overcome Current Limitations
To address the drawbacks of human-operated call scheduling and AI voice agents with multi-step, delay-prone architectures, ScienceSoft developed a fully customizable HIPAA-compliant AI voice assistant for appointment scheduling in healthcare. Built on Amazon's Nova Sonic speech-to-speech model, the agent facilitates natural conversations with patients, allowing them to book, reschedule, and cancel appointments. Healthcare providers can also use it to make outbound rescheduling calls.
The AI voice agent ensures seamless real-time data exchange with hospital systems, such as EHR, CRM, and PMS, via FHIR-based APIs.
The AI assistant features the first-ever integration of Amazon Nova Sonic with the LiveKit Media Server. Nova Sonic is the first model to support bidirectional streaming through a single API. This enables listening and responding in real time without delays or awkward pauses. The model also supports function calling, allowing the AI assistant to interact directly with hospital systems through APIs during the call. For instance, it can access the hospital schedule, check provider availability, and update appointment data in real time.
The agent can potentially reduce appointment booking time by 40%, cut call abandonment rates by 30%, and lower operational costs by at least 50%. As it can handle multiple calls simultaneously, it is expected to process 70% more calls per hour than a patient service representative.
AI Voice Agent Capabilities in Brief
The voice assistant's capabilities can be adapted to the specific needs of a particular healthcare provider. For example, the assistant may be enhanced to offer multilingual scheduling, suggest a virtual consultation to patients, or provide pre-appointment instructions for a specific type of appointment, e.g., a fasting blood test.
Core capabilities
Inbound calls
The assistant answers incoming patient calls and processes booking, rescheduling, cancellation, and clarification requests.
Outbound calls
The AI voice agent can make outbound calls to hospital patients triggered by system events or rules to confirm, cancel, or reschedule appointments.
Identity verification
To guarantee secure access to appointment details, the assistant verifies patient identity during both inbound and outbound calls. If the identity verification fails, the agent escalates the call to a patient service representative.
Real-time provider availability check
The assistant queries integrated systems, like PMS or EHR, to verify provider availability in real time.
Appointment data management
The assistant has direct access to appointment data in the integrated hospital systems and can perform create, read, update, and delete operations within the permissions granted by role-based access control.
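To give a sense of what creating an appointment over a FHIR-based API can involve, here is a minimal sketch that assembles a FHIR Appointment resource as a Python dictionary. The function name, the identifiers, and the 30-minute default duration are illustrative assumptions, not details of ScienceSoft's implementation. Sending the resource to a FHIR server is left out.

```python
from datetime import datetime, timedelta, timezone


def build_fhir_appointment(patient_id, practitioner_id, slot_id, start, minutes=30):
    """Assemble a FHIR Appointment resource (as a dict) for a free Slot.

    All IDs are placeholders; a real agent would take them from the
    integrated hospital systems.
    """
    if start.tzinfo is None:
        # FHIR instants must carry a time zone offset.
        raise ValueError("start must be timezone-aware")
    end = start + timedelta(minutes=minutes)
    return {
        "resourceType": "Appointment",
        "status": "booked",
        "slot": [{"reference": f"Slot/{slot_id}"}],
        "start": start.isoformat(),
        "end": end.isoformat(),
        "participant": [
            {"actor": {"reference": f"Patient/{patient_id}"}, "status": "accepted"},
            {"actor": {"reference": f"Practitioner/{practitioner_id}"}, "status": "accepted"},
        ],
    }


# Hypothetical example values.
appointment = build_fhir_appointment(
    patient_id="example-patient",
    practitioner_id="example-practitioner",
    slot_id="example-slot",
    start=datetime(2025, 3, 10, 9, 0, tzinfo=timezone.utc),
)
```

Rescheduling and cancellation would update the same resource, for instance by changing its start and end times or setting its status to cancelled.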
AI Voice Agent Workflow in a Clinical Setting
When receiving a patient's call, the AI assistant leverages Nova Sonic's natural language capabilities to ask if they would like to make, reschedule, or cancel an appointment. After determining the nature of the patient's request, the assistant verifies the patient's identity by asking for their name, birthdate, and the last four digits of their Social Security number and comparing them against the information in the integrated hospital systems, such as CRM, EHR, or practice management software (PMS).
If the identity check fails, the assistant forwards the call to a patient service representative. If the identity is successfully verified, the assistant inquires about the patient's preferred physician (by specialty or name), the appointment date, and the preferred time. Then, it checks the slot's availability in real time within the integrated hospital scheduling system. If the requested time slot is available, the assistant schedules the appointment, saves the appointment information in the integrated system, and generates a unique booking ID for the patient. Finally, the agent reads out the booking ID to the patient and confirms that a doctor will be waiting for the patient at the scheduled time at the clinic.
Additionally, the agent can make outbound calls to the patients who have given their express consent to receive such calls to confirm or reschedule appointments. Thanks to FHIR integration with hospital systems, the AI assistant can dynamically create, read, update, and delete appointment data across patient records, appointment calendars, and room availability trackers.
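The inbound booking flow described above can be sketched as a short decision sequence. This is a simplified illustration, not the production logic: the function and type names are hypothetical, the hospital systems are replaced by callables passed in by the caller, and the "slot_unavailable" outcome is an assumption, since the text does not say what happens when the requested slot is taken.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CallOutcome:
    action: str                       # "booked", "escalated", or "slot_unavailable"
    booking_id: Optional[str] = None


def handle_booking_call(
    identity: dict,
    requested_slot: str,
    verify_identity: Callable[[dict], bool],
    is_slot_free: Callable[[str], bool],
    save_appointment: Callable[[str, str], None],
) -> CallOutcome:
    # Step 1: a failed identity check hands the call to a human representative.
    if not verify_identity(identity):
        return CallOutcome("escalated")
    # Step 2: check the requested slot in real time (assumed fallback outcome).
    if not is_slot_free(requested_slot):
        return CallOutcome("slot_unavailable")
    # Step 3: save the appointment and return an ID to read out to the patient.
    booking_id = uuid.uuid4().hex[:8].upper()  # demo-only ID format
    save_appointment(booking_id, requested_slot)
    return CallOutcome("booked", booking_id)


# In-memory stand-ins for the integrated hospital systems.
known_patients = [{"name": "Jane Doe", "dob": "1980-01-31", "ssn_last4": "1234"}]
free_slots = {"2025-03-10T09:00"}
booked = {}


def demo_save(booking_id, slot):
    booked[booking_id] = slot
    free_slots.discard(slot)


outcome = handle_booking_call(
    identity=known_patients[0],
    requested_slot="2025-03-10T09:00",
    verify_identity=lambda ident: ident in known_patients,
    is_slot_free=lambda slot: slot in free_slots,
    save_appointment=demo_save,
)
```

Passing the system calls in as parameters keeps the flow testable without a live EHR or scheduling system.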
AI Voice Agent Architecture
Our principal architects provided a reference architecture of the AI healthcare appointment scheduling voice agent. The specific technologies, tools, and services can be adapted to the requirements and budget of each care provider. For example, the underlying architecture supports integration with any of three of the world's most powerful agentic AI models: Amazon Nova Sonic, OpenAI's GPT-4o, or Google's Gemini 2.5 Pro.
The voice agent is deployed in a HIPAA-compliant Amazon Virtual Private Cloud (VPC), integrates with hospital systems (e.g., EHR, CRM, PMS) via FHIR-based APIs, and routes all interactions via LiveKit for real-time processing.
When a patient makes the call, it is directed to a dedicated number via the Amazon Chime SDK, which acts as the telephony provider and handles the initial telephony connection. From there, the call is forwarded to the system's LiveKit Media Server using SIP trunking.
The Media Server creates a "room" for each call, adds the patient and the voice agent as participants, and merges their audio streams to create a real-time conversational session. The server records the dialogue and stores it in an Amazon S3 bucket to generate transcripts, enable compliance checks, and provide data for patient engagement analytics and ongoing model improvement.
All conversations are passed through the Amazon Bedrock Guardrails AI firewall. It continuously monitors the dialogue for anomalous or malicious behavior, verifies a patient's intent (e.g., checks if the patient's request is within the assistant's authorized scope), and enforces HIPAA compliance by filtering agent responses to prevent unauthorized disclosure of protected health information (PHI).
The system supports horizontal scaling. Each voice agent instance handles one call at a time, but instances can be spawned dynamically (e.g., 100, 200, or more) to manage high call volumes concurrently.
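The one-call-per-instance model with a cap on concurrent instances can be illustrated with a single-process stand-in. This is only an analogy: in the described system, instances are separate deployed agents rather than coroutines, and the AgentPool class, its limit of 100, and the simulated call duration are all assumptions made for the example.

```python
import asyncio


class AgentPool:
    """Runs one simulated agent per active call, capped at max_instances."""

    def __init__(self, max_instances):
        self._slots = asyncio.Semaphore(max_instances)
        self.active = 0   # calls being handled right now
        self.peak = 0     # highest concurrency observed

    async def handle_call(self, call_id, duration=0.01):
        # A call waits here if every instance is busy.
        async with self._slots:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                await asyncio.sleep(duration)  # stands in for the conversation
                return f"{call_id}: done"
            finally:
                self.active -= 1


async def run_demo():
    pool = AgentPool(max_instances=100)
    calls = (pool.handle_call(f"call-{i}") for i in range(250))
    results = await asyncio.gather(*calls)
    return pool, results


pool, results = asyncio.run(run_demo())
```

In this run, 250 simulated calls are served while no more than 100 are in progress at once; raising the cap corresponds to spawning more agent instances.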
The agent operates according to the prompt-style textual instructions that define its behavior. Here is an example of such an instruction:
Begin by saying, "Thank you for calling Hospital XYZ".
Introduce yourself and ask how you can help them today: whether they are calling to book, reschedule, or cancel an appointment.
If the patient requests to book, reschedule, or cancel an appointment, inform the patient that the call may contain protected health information, and ask them to ensure they are in a private setting. Then, begin verifying their identity.
Ask for their name first. After they respond, repeat what they said back to them.
Each action, such as booking, rescheduling, cancellation, or data updates, is handled by a dedicated tool. For example:
The verify_identity tool queries the EHR to verify the patient's identity.
The provider_available_slots tool retrieves real-time doctor availability.
The book_appointment tool schedules the appointment directly in the EHR.
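A minimal sketch of how such tools might be registered and dispatched when the model issues a function call is shown below. The tool names come from the list above; everything else is assumed for illustration, including the argument names, the JSON-in/JSON-out convention, and the hard-coded placeholder data that stands in for EHR queries. It does not reproduce the Nova Sonic or LiveKit APIs.

```python
import json

TOOLS = {}


def tool(name):
    """Register a function under the name the model uses to call it."""
    def register(func):
        TOOLS[name] = func
        return func
    return register


@tool("verify_identity")
def verify_identity(name, date_of_birth, ssn_last4):
    # Placeholder check; a real tool would query the EHR.
    known = ("Jane Doe", "1980-01-31", "1234")
    return {"verified": (name, date_of_birth, ssn_last4) == known}


@tool("provider_available_slots")
def provider_available_slots(provider, date):
    # Placeholder data; a real tool would read live availability.
    return {"slots": ["09:00", "10:30"] if provider == "Dr. Smith" else []}


@tool("book_appointment")
def book_appointment(provider, date, time):
    # Placeholder result; a real tool would write to the EHR.
    return {"booking_id": "DEMO-0001", "provider": provider, "date": date, "time": time}


def dispatch(tool_name, arguments_json):
    """Run the requested tool and return its result as a JSON string."""
    func = TOOLS.get(tool_name)
    if func is None:
        return json.dumps({"error": f"unknown tool: {tool_name}"})
    try:
        return json.dumps(func(**json.loads(arguments_json)))
    except (TypeError, ValueError) as exc:
        # Malformed JSON or wrong arguments are reported back, not raised.
        return json.dumps({"error": str(exc)})


reply = dispatch("provider_available_slots", '{"provider": "Dr. Smith", "date": "2025-03-10"}')
```

Returning errors as data lets the agent recover within the conversation, for example by asking the patient to repeat a detail, instead of dropping the call.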
How We Ensure the AI Voice Agent’s Security and HIPAA Compliance
Below, we list the core components that uphold the security and compliance framework of the solution:
Core Security Components
The system operates within an audited, encrypted, and access-controlled Amazon Virtual Private Cloud (VPC).
Amazon Bedrock Guardrails acts as an AI firewall that tracks and filters all interactions between the agent and patients in real time to prevent unauthorized disclosure of sensitive data and detect suspicious input, such as prompt injection attempts.
AWS CloudTrail logs all API calls and keeps detailed audit trails of all interactions, as well as system and user actions.
Amazon CloudWatch monitors performance metrics and sends automated notifications to the security team in case of suspicious changes.
AWS Security Hub evaluates the solution's security controls against NIST SP 800-53 requirements and provides recommendations for mitigating identified security and compliance gaps.
Amazon Macie uses machine learning to discover, classify, and protect sensitive data stored in S3 buckets. It can detect buckets that aren't encrypted, are open to the public, or are shared with unauthorized users, and then carry out remediation steps, like alerting security teams.
VPC Endpoints ensure that all communications between the agent and AWS services stay in the private AWS network to eliminate the chance of public internet exposure and reduce attack risks.
Data Exchange Protocols and Security Controls
FHIR is applied to securely integrate the agent with the healthcare systems, such as EHR, CRM, and PMS.
Identity verification is enforced by the assistant asking for a patient's name, date of birth, and the last four digits of their Social Security number (SSN). If verification fails, the call is automatically redirected to a patient service representative.
Role-based access control mechanisms ensure that the AI assistant performs scheduling without direct access to clinical or billing data.
Data encryption is enforced both at rest (AES-256) and in transit (TLS 1.2), covering all stored data and voice communications.
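The identity check listed above can be sketched as a comparison of the three spoken identifiers against a stored record. This is an illustration under stated assumptions: the record layout, the name normalization, and the use of a constant-time comparison are choices made for the example, not details confirmed by the solution description.

```python
import hmac


def _normalize_name(name):
    # Ignore case and extra whitespace introduced by speech recognition.
    return " ".join(name.lower().split())


def identity_matches(record, spoken_name, spoken_dob, spoken_ssn_last4):
    """Return True only if all three identifiers match the stored record."""
    checks = [
        hmac.compare_digest(
            _normalize_name(record["name"]).encode(),
            _normalize_name(spoken_name).encode(),
        ),
        record["date_of_birth"] == spoken_dob,
        # Constant-time comparison for the SSN fragment.
        hmac.compare_digest(
            record["ssn_last4"].encode(),
            spoken_ssn_last4.strip().encode(),
        ),
    ]
    return all(checks)


# Hypothetical stored record.
record = {"name": "Jane Doe", "date_of_birth": "1980-01-31", "ssn_last4": "1234"}

verified = identity_matches(record, "  jane   DOE ", "1980-01-31", "1234")
rejected = identity_matches(record, "Jane Doe", "1980-01-31", "4321")
```

When the function returns False, the call would be redirected to a patient service representative, as described above.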
Results
Unlike the majority of AI voice agents, which rely on multiple models and thus produce slow and robotic-sounding responses, ScienceSoft's healthcare appointment booking assistant supports natural, human-like conversations tailored to the tone, rhythm, emotions, speed, and speaking style of a patient.
The agent is estimated to reduce appointment booking time by 40%, cut call abandonment rates by 30%, and lower operational costs by at least 50%. As it can handle multiple calls simultaneously, it is expected to process 70% more calls per hour than a patient service representative.
Technologies and Tools
Technologies: AI, GenAI, LLM, Cloud.
Tools and platforms: Amazon Bedrock, Amazon Nova Sonic, Amazon Bedrock Guardrails, Amazon Polly, Amazon Transcribe, Amazon Elastic Container Service (Amazon ECS), VPC Endpoints, LiveKit SDK, LiveKit Media Server, Amazon Chime SDK, AWS CloudTrail, Amazon CloudWatch, AWS Security Hub, Amazon Macie.