How a Pharmacy Benefits Firm Cleared FDA PDFs in 10 Weeks

Arrive Health's pharmacy specialists offloaded FDA PDF review to a RAG-based AI that drafts drug equivalency recommendations — shipped in 10 weeks at 95%+ accuracy with human-in-the-loop validation.

10 weeks

Built and shipped in 10 weeks

4–8 weeks

Implementation Time

Ross Hale

Co-founder & CEO

Artium

The Challenge

Arrive Health, a pharmacy benefits manager, relied on highly manual, specialist-intensive processes to review FDA drug sheets and identify cost-equivalent drug therapies. Specialists would read complex pharmaceutical PDFs, compare drug equivalencies, and manually enter data into a rules engine — a slow, expensive process that limited how quickly the company could surface cost savings for insurers, pharmacies, and patients. Scaling this review process without proportionally scaling headcount was the core challenge.

What They Built

Artium partnered with Arrive Health and AWS to design and build a RAG-based generative AI system that preprocesses FDA pharmaceutical PDFs, extracts and structures drug information, and generates drug equivalency recommendations for human specialist review. The system was purpose-built with Artium's proprietary CAT (Continuous Alignment Testing) framework, which continuously monitors AI output reliability at the developer, overnight, and production levels to prevent hallucinations and ensure consistent performance. Built in approximately 10 weeks of hands-on development, the solution replaced the upstream manual processing step while keeping human specialists in the loop for final validation.

Artium began by designing purpose-built preprocessing pipelines capable of ingesting complex pharmaceutical PDFs: structured documents with regulatory language that standard parsers struggled to handle accurately. Drug information was extracted, normalized, and structured for downstream retrieval. The RAG architecture was then tuned specifically to the domain, with semantic chunking aligned to pharmaceutical data patterns and a hybrid retrieval layer optimized for both structured queries and open-ended equivalency questions.

To manage hallucination risk in a regulated healthcare environment, Artium developed its proprietary Continuous Alignment Testing (CAT) framework, a multi-stage reliability system that monitored AI outputs at the developer level, overnight in automated test suites, and in production. Rather than being left to a final QA pass, reliability testing was embedded throughout the build cycle through CAT.

Human specialists were kept in the loop for final validation, ensuring the system served as an accelerant to expert review rather than a replacement. The entire solution was delivered in approximately 10 weeks on AWS, compressing what had previously been projected as a multi-year initiative.
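The hybrid retrieval layer described above can be illustrated with a minimal sketch. The source does not say how keyword and vector results were combined, so this example assumes reciprocal rank fusion, a common choice, and uses a made-up three-chunk corpus with hand-written embeddings. Every name here (`CHUNKS`, `hybrid_search`, and so on) is hypothetical and is not taken from Artium's system.

```python
import math
from collections import Counter

# Hypothetical toy corpus: chunk id -> (text, embedding).
# Real embeddings would come from an embedding model; these are invented.
CHUNKS = {
    "a": ("atorvastatin 20 mg tablet oral", [0.9, 0.1, 0.0]),
    "b": ("rosuvastatin 10 mg tablet oral", [0.8, 0.2, 0.1]),
    "c": ("amoxicillin 500 mg capsule oral", [0.1, 0.9, 0.2]),
}


def keyword_score(query, text):
    """Count query words that also appear in the chunk text."""
    q = Counter(query.lower().split())
    t = Counter(text.lower().split())
    return sum(min(q[w], t[w]) for w in q)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank(scores):
    """Chunk ids sorted by descending score; ties broken by id."""
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def hybrid_search(query, query_vec, k=60, top_n=2):
    """Fuse a keyword ranking and a vector ranking with reciprocal rank fusion."""
    kw = rank({cid: keyword_score(query, text) for cid, (text, _) in CHUNKS.items()})
    vec = rank({cid: cosine(query_vec, emb) for cid, (_, emb) in CHUNKS.items()})
    fused = Counter()
    for ranking in (kw, vec):
        for pos, cid in enumerate(ranking, start=1):
            fused[cid] += 1.0 / (k + pos)  # higher rank contributes more
    return rank(fused)[:top_n]


results = hybrid_search("atorvastatin tablet", [0.9, 0.1, 0.0])
```

Rank fusion only needs the order of each result list, so the keyword and vector scores do not have to be on the same scale. Whether the production system used this or a weighted score blend is not stated.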

AI Role

The AI preprocesses FDA pharmaceutical PDFs, extracts and structures drug information from complex regulatory documents, and generates drug equivalency recommendations for human specialist review. It operates within a RAG architecture monitored continuously by Artium's proprietary CAT framework, which tests AI output reliability at the developer, overnight, and production levels to prevent hallucinations and ensure consistent recommendation quality.

AI Model

Custom / proprietary

Infrastructure

AWS (cloud infrastructure and hosting) • RAG architecture (custom-built) • Proprietary CAT (Continuous Alignment Testing) framework • FDA pharmaceutical PDF ingestion pipeline

Integration Points

AWS services integrated with custom RAG retrieval layer • PDF preprocessing pipeline feeding structured drug data into vector store • CAT framework connected across developer, CI/CD, and production environments • Human specialist review interface receiving AI-generated recommendations
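As a rough illustration of the preprocessing step that feeds structured drug data into a vector store, the sketch below splits already-extracted label text into one chunk per section. It assumes the PDF text has been extracted upstream and that sections begin with a numbered, all-caps heading; both are assumptions for the example, and `chunk_by_section` and the sample text are hypothetical.

```python
import re

# Assumed heading shape: a section number, spaces, then an all-caps title
# on its own line, e.g. "2 DOSAGE AND ADMINISTRATION".
HEADING = re.compile(r"^(\d+)[ \t]+([A-Z][A-Z &,/-]+)$", re.MULTILINE)


def chunk_by_section(label_text, drug_name):
    """Return one dict per section so each chunk carries its own metadata."""
    matches = list(HEADING.finditer(label_text))
    chunks = []
    for i, m in enumerate(matches):
        # A section body runs until the next heading or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(label_text)
        chunks.append({
            "drug": drug_name,
            "section": m.group(2).strip(),
            "text": label_text[m.end():end].strip(),
        })
    return chunks


SAMPLE = """1 INDICATIONS AND USAGE
Indicated as an adjunct to diet.
2 DOSAGE AND ADMINISTRATION
Take once daily.
"""

chunks = chunk_by_section(SAMPLE, "exampledrug")
```

Keeping the drug name and section title alongside each chunk lets a retrieval layer filter on structured fields as well as search the text.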

Impact

Proof-of-Concept to Working Solution in 10 Weeks

Artium delivered a working AI-powered drug equivalency recommendation system for Arrive Health in approximately 10 weeks of hands-on development — compressing what would previously have been a multi-year initiative into a single quarter.

Human Specialists Freed from Upstream Manual Processing

By automating the preprocessing of FDA drug sheets and generating equivalency recommendations before specialist review, the system significantly reduced manual data processing burden — allowing specialists to focus their time on validation and high-judgment decisions rather than initial data extraction.
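The division of labor described here, where the AI drafts and a specialist decides, can be sketched as a small review workflow. This is an illustrative data model only; the field names, the `Status` values, and the idea that only approved items move downstream are assumptions, not details of the actual review interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Recommendation:
    """An AI-drafted equivalency recommendation awaiting specialist review."""
    source_drug: str
    equivalent_drug: str
    evidence: List[str]  # e.g. retrieved passages supporting the draft
    status: Status = Status.PENDING
    reviewer: Optional[str] = None


def review(rec, reviewer, approve):
    """Record a specialist's decision on a drafted recommendation."""
    rec.status = Status.APPROVED if approve else Status.REJECTED
    rec.reviewer = reviewer
    return rec


def publishable(recs):
    """Only specialist-approved recommendations move downstream."""
    return [r for r in recs if r.status is Status.APPROVED]


drafts = [
    Recommendation("Drug A", "Drug B", ["retrieved passage 1"]),
    Recommendation("Drug C", "Drug D", []),
]
```

Nothing is publishable until a reviewer acts on it, which keeps the final decision with the specialist.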

CAT Framework Enables Reliable AI in High-Stakes Healthcare Context

Artium's Continuous Alignment Testing framework held the AI system to the accuracy threshold needed for effective specialist use (roughly 95% or higher) in a healthcare context where LLM refusals posed an added challenge. That gave Arrive Health enough confidence in system reliability to roll out to a beta group.
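One way to picture an accuracy threshold for a non-deterministic system is a gate that runs each test case several times and checks the overall pass rate. CAT is proprietary and its internals are not described in this case study, so the sketch below is not its implementation. `Case`, `pass_rate`, `gate`, the exact-match comparison, and the stub model are all hypothetical; only the 0.95 threshold echoes the figure above.

```python
from dataclasses import dataclass


@dataclass
class Case:
    prompt: str
    expected: str


def pass_rate(model, cases, runs_per_case=5):
    """Fraction of runs whose output matches the expected answer.

    Each case is run several times because model output can vary
    from one call to the next.
    """
    passed = total = 0
    for case in cases:
        for _ in range(runs_per_case):
            total += 1
            if model(case.prompt).strip().lower() == case.expected.lower():
                passed += 1
    return passed / total if total else 0.0


def gate(model, cases, threshold=0.95, runs_per_case=5):
    """Return (ok, rate); ok is True when the pass rate meets the threshold."""
    rate = pass_rate(model, cases, runs_per_case)
    return rate >= threshold, rate


# Stand-in for a real model call; a deterministic stub keeps the example runnable.
def stub_model(prompt):
    return {"q1": "Yes", "q2": "no"}[prompt]


ok, rate = gate(stub_model, [Case("q1", "yes"), Case("q2", "no")])
```

The same gate could run on a developer machine, in a nightly suite, or against sampled production traffic, with only the case set and run count changing.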

Implementation Complexity

The solution required purpose-built preprocessing pipelines for complex pharmaceutical PDFs, a custom RAG architecture tuned to domain-specific regulatory language, and a proprietary continuous alignment testing framework to manage hallucination risk in a regulated healthcare context, all delivered in approximately 10 weeks. The highly regulated environment, the specialized nature of pharmaceutical data, and the reliability requirements make implementation significantly more complex than a standard RAG deployment.

Best Fit For

Fortune 500 enterprises and growth-stage companies (Series A and up) in highly regulated industries — especially healthcare, financial services, and media/entertainment — that need to build custom AI-native software, move quickly from POC to production, and require reliability frameworks for non-deterministic AI outputs.

Ross Hale

Co-founder & CEO

Artium

As CEO of Artium, an industry-leading AI Software Development consultancy, I help enterprises build safe, reliable, market-winning AI applications.

Industry:

Healthcare & Life Sciences

Business Function:

Operations

Legal & Compliance

Company Size:

51-250

Project Cost:

Cost not disclosed

Ownership:

Venture-Backed

Organization Type:

Private Company

AI Pattern:

Knowledge Management & Search (RAG)

Document Processing & Extraction

Value Type:

Time Savings

Cost Reduction

Headcount Avoidance

Risk & Compliance

AI Model:

Custom / proprietary

Frequently Asked Questions

How did Arrive Health build a RAG system to clear FDA pharmaceutical PDFs?

Artium designed purpose-built preprocessing pipelines to ingest complex pharmaceutical PDFs and to extract and structure drug information, then tuned a RAG architecture to the domain with semantic chunking and hybrid retrieval. To manage hallucination risk in a regulated setting, the team embedded its proprietary Continuous Alignment Testing framework, which monitored outputs at the developer, overnight, and production levels, and kept human specialists in the loop for final validation. The working drug-equivalency recommendation system was delivered in roughly 10 weeks.

What AI tools and approach powered the FDA document system?

The work combined RAG-based knowledge management and search with document processing and extraction, built on a custom/proprietary model on AWS. The RAG architecture used semantic chunking aligned to pharmaceutical data and hybrid retrieval, wrapped in a proprietary Continuous Alignment Testing (CAT) reliability framework.

What results did Arrive Health achieve?

The team went from proof of concept to a working solution in 10 weeks, freed human specialists from upstream manual data processing so they could focus on validation and high-judgment decisions, and used the CAT framework to hold the system to the ~95%+ accuracy needed for a confident beta rollout in a high-stakes healthcare context.

How long did the RAG engagement take?

Approximately 10 weeks of hands-on development — within the two-to-four-month range — compressing what had been projected as a multi-year initiative into a single quarter.

Who is this RAG approach best for?

Fortune 500 enterprises and growth-stage companies (Series A and up) in highly regulated industries — especially healthcare, financial services, and media — that need custom AI-native software, fast movement from POC to production, and reliability frameworks for non-deterministic AI outputs.