How a Pharmacy Benefits Firm Cleared FDA PDFs in 10 Weeks
Arrive Health's pharmacy specialists offloaded FDA PDF review to a RAG-based AI that drafts drug equivalency recommendations — shipped in 10 weeks at 95%+ accuracy with human-in-the-loop validation.
10 weeks
Built and shipped in 10 weeks
The Challenge
Arrive Health, a pharmacy benefits manager, relied on highly manual, specialist-intensive processes to review FDA drug sheets and identify cost-equivalent drug therapies. Specialists would read complex pharmaceutical PDFs, compare drug equivalencies, and manually enter data into a rules engine — a slow, expensive process that limited how quickly the company could surface cost savings for insurers, pharmacies, and patients. Scaling this review process without proportionally scaling headcount was the core challenge.
What They Built
Artium partnered with Arrive Health and AWS to design and build a RAG-based generative AI system that preprocesses FDA pharmaceutical PDFs, extracts and structures drug information, and generates drug equivalency recommendations for human specialist review. The system was purpose-built with Artium's proprietary CAT (Continuous Alignment Testing) framework, which continuously monitors AI output reliability at the developer, overnight, and production levels to prevent hallucinations and ensure consistent performance. Built in approximately 10 weeks of hands-on development, the solution replaced the upstream manual processing step while keeping human specialists in the loop for final validation.
Artium began by designing purpose-built preprocessing pipelines capable of ingesting complex pharmaceutical PDFs — structured documents with regulatory language that standard parsers struggled to handle accurately. Drug information was extracted, normalized, and structured for downstream retrieval. The RAG architecture was then tuned specifically to the domain, with semantic chunking aligned to pharmaceutical data patterns and a hybrid retrieval layer optimized for both structured queries and open-ended equivalency questions.
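To make the chunking and hybrid retrieval ideas concrete, here is a minimal sketch and not Artium's implementation. It assumes the PDF text has already been extracted, that sections start with numbered upper-case headings, and that a simple term-count cosine can stand in for a real embedding model. The names `chunk_by_section`, `hybrid_search`, and the `alpha` weight are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed heading pattern: numbered upper-case titles such as
# "1 INDICATIONS AND USAGE". Real label layouts vary.
HEADING = re.compile(r"^\d+(?:\.\d+)?\s+[A-Z][A-Z ,&/-]+$", re.MULTILINE)


def chunk_by_section(text):
    """Split extracted label text into one chunk per numbered section."""
    matches = list(HEADING.finditer(text))
    chunks = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chunks.append({"section": m.group().strip(), "text": text[m.end():end].strip()})
    return chunks


def tokens(s):
    return re.findall(r"[a-z0-9]+", s.lower())


def keyword_score(query, text):
    """Fraction of distinct query terms that appear in the text."""
    q = set(tokens(query))
    return len(q & set(tokens(text))) / len(q) if q else 0.0


def cosine_score(query, text):
    """Cosine similarity over term counts; a stand-in for embedding similarity."""
    a, b = Counter(tokens(query)), Counter(tokens(text))
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_search(query, chunks, alpha=0.5, k=3):
    """Rank chunks by a weighted blend of keyword and vector-style scores."""
    scored = [
        (alpha * keyword_score(query, c["text"]) + (1 - alpha) * cosine_score(query, c["text"]), c)
        for c in chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Invented sample text, for illustration only.
SAMPLE = """1 INDICATIONS AND USAGE
Drug A is indicated for the treatment of hypertension in adults.
2 DOSAGE AND ADMINISTRATION
The recommended starting dose is 10 mg once daily.
3 DOSAGE FORMS AND STRENGTHS
Tablets: 10 mg and 20 mg.
"""

chunks = chunk_by_section(SAMPLE)
top_score, top_chunk = hybrid_search("recommended starting dose", chunks, k=1)[0]
print(top_chunk["section"])
```

Splitting on section boundaries keeps each chunk about one topic, and the blended score lets exact term matches and looser similarity both contribute to ranking. In a production system the cosine stand-in would be replaced by similarity over embeddings from a vector store.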
To manage hallucination risk in a regulated healthcare environment, Artium developed its proprietary Continuous Alignment Testing (CAT) framework — a multi-stage reliability system that monitored AI outputs at the developer level, overnight in automated test suites, and in production. Rather than treating reliability as a final QA pass, CAT was embedded throughout the build cycle. Human specialists were kept in the loop for final validation, ensuring the system served as an accelerant to expert review rather than a replacement. The entire solution was delivered in approximately 10 weeks on AWS, compressing what had previously been projected as a multi-year initiative.
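The CAT framework itself is proprietary and its internals are not described here. As a rough illustration of the general idea of testing non-deterministic output against a pass-rate threshold, the sketch below runs the same prompt many times and gates on the fraction of acceptable answers. Everything in it is an assumption: `FlakyStub` is a deterministic stand-in for a model call, and `run_suite`, the case format, and the 0.95 default are hypothetical.

```python
class FlakyStub:
    """Hypothetical stand-in for an LLM: wrong on every `period`-th call."""

    def __init__(self, period):
        self.period = period
        self.calls = 0

    def __call__(self, prompt):
        self.calls += 1
        return "not equivalent" if self.calls % self.period == 0 else "equivalent"


def pass_rate(generate, prompt, check, runs):
    """Run one prompt repeatedly and return the fraction of outputs that pass."""
    passed = sum(1 for _ in range(runs) if check(generate(prompt)))
    return passed / runs


def run_suite(cases, generate, runs, threshold=0.95):
    """Return (all_ok, per-case pass rates) for (name, prompt, check) cases."""
    rates = {name: pass_rate(generate, prompt, check, runs) for name, prompt, check in cases}
    return all(rate >= threshold for rate in rates.values()), rates


# One invented test case; a real suite would hold many, curated by specialists.
CASES = [
    ("generic_swap", "Is drug A equivalent to drug B?", lambda out: out == "equivalent"),
]

ok, rates = run_suite(CASES, FlakyStub(period=25), runs=100)
print(ok, rates)
```

The same suite could plausibly run with a small `runs` value on a developer machine and a larger one overnight, since a single passing run says little about a system whose output varies from call to call.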
AI Role
The AI preprocesses FDA pharmaceutical PDFs, extracts and structures drug information from complex regulatory documents, and generates drug equivalency recommendations for human specialist review. It operates within a RAG architecture monitored continuously by Artium's proprietary CAT framework, which tests AI output reliability at the developer, overnight, and production levels to reduce hallucinations and keep recommendation quality consistent.
AI Model
Custom / proprietary
Infrastructure
AWS (cloud infrastructure and hosting) • RAG architecture (custom-built) • Proprietary CAT (Continuous Alignment Testing) framework • FDA pharmaceutical PDF ingestion pipeline
Integration Points
AWS services integrated with custom RAG retrieval layer • PDF preprocessing pipeline feeding structured drug data into vector store • CAT framework connected across developer, CI/CD, and production environments • Human specialist review interface receiving AI-generated recommendations
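The hand-off between AI-generated recommendations and specialist review can be pictured as a simple queue in which nothing moves downstream until a person approves it. This is a minimal sketch under assumed names (`Recommendation`, `ReviewQueue`, `Status`); the actual review interface and its data model are not described in this case study.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Recommendation:
    """An AI-drafted equivalency suggestion awaiting specialist review."""
    source_drug: str
    candidate_drug: str
    rationale: str
    status: Status = Status.PENDING
    reviewer: Optional[str] = None


@dataclass
class ReviewQueue:
    items: List[Recommendation] = field(default_factory=list)

    def submit(self, rec):
        """Add an AI-generated recommendation; it starts as PENDING."""
        self.items.append(rec)

    def review(self, index, reviewer, approve):
        """Record a specialist's decision on one recommendation."""
        rec = self.items[index]
        rec.status = Status.APPROVED if approve else Status.REJECTED
        rec.reviewer = reviewer

    def approved(self):
        """Only approved items would be passed on to a rules engine."""
        return [r for r in self.items if r.status is Status.APPROVED]


# Invented example data.
queue = ReviewQueue()
queue.submit(Recommendation("Drug A 10 mg", "Drug B 10 mg", "Same active ingredient and strength"))
queue.submit(Recommendation("Drug A 10 mg", "Drug C 5 mg", "Different strength"))
queue.review(0, reviewer="specialist_1", approve=True)
queue.review(1, reviewer="specialist_1", approve=False)
print([r.candidate_drug for r in queue.approved()])
```

Keeping the reviewer and decision on each record means every entry that reaches the rules engine can be traced back to a human sign-off.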
Impact
Proof-of-Concept to Working Solution in 10 Weeks
Artium delivered a working AI-powered drug equivalency recommendation system for Arrive Health in approximately 10 weeks of hands-on development — compressing what would previously have been a multi-year initiative into a single quarter.
Human Specialists Freed from Upstream Manual Processing
By automating the preprocessing of FDA drug sheets and generating equivalency recommendations before specialist review, the system significantly reduced manual data processing burden — allowing specialists to focus their time on validation and high-judgment decisions rather than initial data extraction.
CAT Framework Enables Reliable AI in High-Stakes Healthcare Context
Artium's Continuous Alignment Testing framework kept the AI system at the accuracy threshold specialists needed for effective use (roughly 95% or higher), in a healthcare context where LLM refusals were an added challenge. That gave Arrive Health the confidence to roll the system out to a beta group.
Implementation Complexity
The solution required purpose-built preprocessing pipelines for complex pharmaceutical PDFs, a custom RAG architecture tuned to domain-specific regulatory language, and a proprietary continuous alignment testing framework to manage hallucination risk in a regulated healthcare context, all delivered in approximately 10 weeks. The highly regulated environment, the specialized nature of pharmaceutical data, and the reliability requirements make implementation significantly more complex than a standard RAG deployment.
Best Fit For
Fortune 500 enterprises and growth-stage companies (Series A and up) in highly regulated industries — especially healthcare, financial services, and media/entertainment — that need to build custom AI-native software, move quickly from POC to production, and require reliability frameworks for non-deterministic AI outputs.