The Untold Truth: What We Learned from LLM Implementation in Life Sciences Analytics
In our six-plus years in the intricate realm of life sciences, we've navigated a landscape rich with challenges faced by our customers. As a conversational analytics platform, our enduring mission has been to harness natural language interfaces to deliver insights with unprecedented speed. While WhizAI has adeptly addressed a multitude of issues, new and technically formidable challenges perpetually beckon. Then came GPT, a pivotal moment that catapulted us into experimentation with Large Language Models (LLMs). Our paramount task: identifying the most pertinent problems that LLMs could elegantly solve.
In this blog, we share that journey: the problems we set out to solve, the challenges we encountered, the techniques we used to overcome them, and our learnings about what it takes to stay ahead of the curve in life sciences and healthcare analytics.
The Vision: Access Insights Anytime, Anywhere, Anyway You Choose!
- We wanted to ensure that customers can ask complex natural language questions over their information stores in any way they choose.
- Also, in our quest to become a true GenAI platform, we wanted to leverage LLMs to do what they do best: comprehend and summarize data in simple words that are easy for our end users to consume.
Experimenting with Models: A Diverse Landscape
In our quest to overcome these challenges, we ran experiments with several LLMs and SLMs (Small Language Models) from providers including Meta, OpenAI, Hugging Face, and Mistral AI. This experimentation was vital in unraveling the potential applications that LLMs could unlock in the life sciences domain.
Challenges We Faced: Innovating Solutions
In analytics, accuracy is paramount. One of the significant challenges in using LLMs has been minimizing hallucinations, primarily because the way LLMs interpret information can differ from industry expectations, and out-of-the-box solutions are not adequate in all cases.
- Addressing Performance Bottlenecks: Converting natural language questions into precise LLM-understandable inputs, interpreting the output, and then generating an accurate, timely response for end users posed significant hurdles, all while keeping GPU costs under control, since only a commercially viable solution can be sold. Balancing accuracy, speed, and cost became the key challenge.
- Managing High Costs of Implementation: GPUs, widely used in machine learning for their high processing power, are critical for LLM computations and expensive, and the system must remain scalable and fast without compromising accuracy. The challenge: supporting thousands of simultaneous users efficiently. How do we ensure lightning speed with better accuracy at lower cost?
Solutions: Innovating Beyond Barriers
- Enhanced Accuracy: We experimented with the approaches below, which helped improve accuracy.
- Prompt Engineering Iterations: Instructed the models on the exact structure and layout of the output.
- Creativity Suppression: Curtailed model creativity to prioritize factual and structured outputs.
- Customized Fine-tuning: We fine-tuned our models on thousands of natural-language questions drawn from our industry experience, client feedback, and surveys to improve accuracy.
- Post-Processing and Hybrid NLP: Implemented guardrails against model hallucinations.
- Human Feedback Integration: Leveraging user feedback for continual fine-tuning.
- Contextualization: Provided additional context to refine LLM interpretations.
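To make a couple of these techniques concrete, here is a minimal Python sketch of how a structured prompt and a post-processing guardrail can work together. The metric whitelist, template, and function names are illustrative assumptions, not our production code.

```python
import json

# Hypothetical whitelist of metrics the platform supports; anything the
# model invents outside this set is treated as a hallucination.
ALLOWED_METRICS = {"trx", "nrx", "market_share"}

PROMPT_TEMPLATE = (
    "You are an analytics assistant. Answer ONLY with JSON of the form "
    '{{"metric": <name>, "value": <number>}}. Use only these metrics: {metrics}. '
    "Question: {question}"
)

def build_prompt(question: str) -> str:
    """Prompt-engineering step: pin down the exact output structure."""
    return PROMPT_TEMPLATE.format(metrics=sorted(ALLOWED_METRICS), question=question)

def validate_response(raw: str) -> dict:
    """Post-processing guardrail: reject malformed or hallucinated output."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        raise ValueError("model did not return valid JSON")
    if payload.get("metric") not in ALLOWED_METRICS:
        raise ValueError(f"hallucinated metric: {payload.get('metric')!r}")
    if not isinstance(payload.get("value"), (int, float)):
        raise ValueError("value must be numeric")
    return payload
```

Rejected responses can be retried or routed to a deterministic NLP fallback, which is the essence of the hybrid approach above.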
- Improved Speed: We experimented with the techniques below to solve performance issues.
- Efficient Inference: Employed Grouped-Query Attention (GQA) for swift inference suitable for real-time applications.
- Long Sequence Handling: Employed Sliding Window Attention (SWA) to handle extended text sequences without compromising performance.
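To illustrate the idea behind SWA, the toy sketch below builds the boolean attention mask that restricts each token to a fixed causal window, so memory and compute per token stay constant as the sequence grows. It is a standalone example, not an excerpt from any model implementation.

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: position i may attend to positions i-window+1 .. i.

    Causal attention restricted to a fixed window of `window` tokens.
    """
    i = np.arange(seq_len)[:, None]   # query positions (rows)
    j = np.arange(seq_len)[None, :]   # key positions (columns)
    return (j <= i) & (j > i - window)
```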
- Reduced Cost: We experimented with the techniques below to limit costs without sacrificing performance.
- Activation-Aware Weight Quantization (AWQ): Compresses models by reducing the precision of their weights while maintaining accuracy. AWQ adapts weight quantization according to the activations, allowing the model to retain high performance with reduced computational and memory requirements, making it well suited to deploying large models on resource-constrained hardware.
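The toy sketch below captures the intuition behind activation-aware quantization: input channels that see larger average activations are scaled up before uniform quantization, so their weights lose less precision. It simplifies the real AWQ algorithm considerably; the function name and scaling heuristic are illustrative.

```python
import numpy as np

def awq_style_quantize(weights, activations, n_bits=4):
    """Toy activation-aware quantization (not the exact AWQ algorithm).

    weights:     (out_features, in_features) matrix to compress
    activations: (n_samples, in_features) calibration activations
    """
    # Per-input-channel importance from mean absolute activation magnitude.
    importance = np.abs(activations).mean(axis=0)            # (in_features,)
    scale = np.sqrt(importance / importance.mean() + 1e-8)   # gentle per-channel scale
    w_scaled = weights * scale                               # protect salient channels
    # Uniform symmetric quantization to n_bits.
    qmax = 2 ** (n_bits - 1) - 1
    step = np.abs(w_scaled).max() / qmax
    q = np.clip(np.round(w_scaled / step), -qmax, qmax)
    return (q * step) / scale                                # fold the scale back out
```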
What Does it Take to Choose the Right LLM?
We noticed that certain LLMs performed markedly better than others on the parameters below.
- Data Privacy: Large Language Models (LLMs) pose potential risks to data privacy and compliance with regulations like HIPAA (Health Insurance Portability and Accountability Act) due to:
- Data Exposure During Training: LLMs trained on publicly available or improperly anonymized datasets might inadvertently store or expose sensitive information, such as personal health data.
- Uncontrolled Data Retention: When LLMs process user queries, input data may be logged or cached, creating risks of data breaches or unauthorized access.
- Lack of Contextual Understanding: LLMs may misinterpret or mishandle confidential or nuanced data, leading to unintentional leaks or compliance violations.
These challenges highlight the need for rigorous privacy controls, secure infrastructure, and adherence to data handling regulations when using LLMs in sensitive domains like healthcare.
- Ease of Customization and Tuning: Customization involves multiple iterations of tuning the model on input data, which ensures better relevance and alignment with domain-specific requirements, driving greater value for end users.
- Versatility in solving different customer pain points: There are many problems we want to solve in the life sciences industry. We want to ensure that one LLM fine-tuned across different use cases performs as expected, with no hallucinations, ambiguities, or limitations.
- Accuracy, Speed and Cost: They form an interdependent triad that directly impacts customer satisfaction. Faster solutions enhance user experience but may compromise accuracy or increase costs if not optimized. Conversely, high accuracy ensures reliability but could slow processes or raise expenses. Striking the right balance ensures efficient, reliable, and cost-effective outcomes, delivering maximum value to the customer.
- Open-Source and Freely Available: Open-source models provide flexibility and customization opportunities, making them valuable for tailored use cases.
Key Learnings:
- Quality Training Data: LLMs are only as good as their training data, so curating high-quality datasets is critical.
- Iterative Approach: Fail fast, learn fast, build faster. Choose a problem, run smaller iterations, which are easier to analyze and fix.
- Model Suitability: LLMs (Mistral, Llama 2) are better suited to complex tasks (e.g., advanced narratives) than models like T5 (an SLM, i.e., Small Language Model). LLMs are better positioned to understand the problem by virtue of their large context size.
- Quantization Techniques: These give you faster performance, but pay attention to the accuracy trade-off.
- Performance Metrics: Incorporate concurrency, throughput, and latency into performance evaluations.
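A minimal harness like the sketch below can capture all three numbers at once. The function names are assumptions, and `handler` stands in for a call to a model-serving endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import quantiles

def benchmark(handler, n_requests=100, concurrency=8):
    """Measure latency percentiles and throughput under concurrent load."""
    latencies = []
    start = time.perf_counter()

    def timed_call(_):
        t0 = time.perf_counter()
        handler()                    # stand-in for one inference request
        latencies.append(time.perf_counter() - t0)

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(timed_call, range(n_requests)))

    wall = time.perf_counter() - start
    cuts = quantiles(latencies, n=100)          # 99 percentile cut points
    return {
        "p50_s": cuts[49],
        "p95_s": cuts[94],
        "throughput_rps": n_requests / wall,
    }
```

Running the same harness at increasing `concurrency` levels shows where throughput plateaus and tail latency starts to climb, which is exactly the trade-off the metrics above are meant to surface.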
The Path Forward: Innovating Continuously
Our journey with LLMs has been one of discovery, innovation, and continuous learning. While challenges remain, our commitment to pushing boundaries ensures that WhizAI stays ahead of the curve in delivering transformative insights. By embracing experimentation, fine-tuning, and relentless optimization, we are redefining what’s possible in life sciences analytics.
The future belongs to those who innovate, and at WhizAI, we're just getting started. Request a demo to learn how you can stay ahead of the curve with transformative insights at your fingertips.