149: AI Radiology for Vets: How Accurate Are Today's Tools Really? With Dr Steve Joslyn
We scrutinise one of the most practical yet under‑examined advances in veterinary practice: AI‑based radiology interpretation tools.
I sit down with veterinary radiologist and Vedi entrepreneur Dr Steve Joslyn to unpack the rise of AI-powered radiology tools in general practice. But this isn't just opinion: Steve reveals the findings from his team's recent study that put six commercially available AI radiology software platforms in the spotlight - or up on the light box - to assess whether they deliver on what they promise.
From how these systems are trained, to where they shine (and where they fail), this conversation gives a no-nonsense look at what AI can actually do for your diagnostic imaging workflow.
What You’ll Learn:
- How these tools are built: Neural networks, down-sampling, and the truth behind “ground truth”.
- The data dilemma: Why most AI tools perform best in theory, not in general practice.
- Where they fall short: From image quality issues to breed bias and external validation gaps.
- New accuracy data: Insights from Dr Joslyn’s pilot study comparing six commercial AI tools.
- A decision-making playbook: When to trust AI, when to double-check, and when to avoid it entirely.
- Ethics and workflow impact: Who’s responsible? What do you tell clients? Can AI triage be trusted?
- How to stay future-ready: What’s coming next – and how to adapt without compromising care.
🎧 Listen now for the tools to ask better questions about AI in your clinic.
Find out how we can help you build your vet career at thevetvault.com.
Check out our Advanced Surgery Podcast at cutabove.supercast.com.
Get case support from our team of specialists in our Specialist Support Space.
Subscribe to our weekly newsletter here for Hubert's favourite clinical and non-clinical learnings from the week.
Join us in person for our epic adventure CE events at Vets On Tour. (Next up: Japan snow conference!)
Concerns About Veterinary AI Radiology Software
- Actual Accuracy: External validation studies conducted by Dr. Steve Joslyn's team found that the accuracy of the six leading commercial veterinary AI radiology interpretation tools is currently closer to 50/50 than the 90% to 95% claimed by the companies.
- Technical Fragility and Lack of Robustness: The models exhibit concerning behavior when confronted with real-world scenarios:
- Decreased Performance with More Information: Unlike a human clinician, the AI can actually perform worse when given three views of an abdomen instead of two. Most systems process each image individually, so each additional view increases the chance of conflicting findings, compounding the system's "miss rate".
- Lack of Repeatability: When the same radiograph is submitted twice, but rotated slightly (no more than five degrees), the reports can disagree. Cases called normal in the first report were sometimes called abnormal in the second. This lack of repeatability is an alarming side effect.
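The rotation experiment described above can be reproduced as a simple harness. This is a minimal illustrative sketch, not code from the study: `classify` is a placeholder standing in for a call to any AI interpretation service, and the nearest-neighbour rotation is a deliberately simple stand-in for a proper imaging pipeline.

```python
import numpy as np

def rotate(image, degrees):
    """Rotate a 2-D image about its centre by a small angle,
    using nearest-neighbour sampling (edges are clamped)."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(degrees)
    ys, xs = np.indices((h, w))
    # Inverse mapping: find where each output pixel came from.
    src_y = cy + (ys - cy) * np.cos(theta) - (xs - cx) * np.sin(theta)
    src_x = cx + (ys - cy) * np.sin(theta) + (xs - cx) * np.cos(theta)
    src_y = np.clip(np.rint(src_y).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(src_x).astype(int), 0, w - 1)
    return image[src_y, src_x]

def repeatability_score(classify, image, angles=(-5, -2, 2, 5)):
    """Fraction of slightly rotated submissions whose label agrees
    with the unrotated baseline. 1.0 means perfectly repeatable."""
    baseline = classify(image)
    agreements = [classify(rotate(image, a)) == baseline for a in angles]
    return sum(agreements) / len(agreements)
```

A classifier that is robust to these tiny perturbations scores 1.0; the study's finding is that real tools can flip between "normal" and "abnormal" under exactly this kind of sub-five-degree rotation.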
Ethical Risks and Professional Liability
- The Blind Leading the Blind: The common practice of using AI tools as "an extra set of eyes" to confirm a vet's suspicion is dangerous: if the vet lacks confidence in their own read, relying on an inaccurate tool is "closer to the blind leading the blind".
- Trust and Misleading Claims: The tools may appear helpful because they generate the expected response. However, if vets trust the tools without understanding radiological interpretation, this can lead to increased morbidity and mortality for animals due to false positives (unnecessary surgery) or false negatives (missed needed surgery).
- Lack of Transparency and Regulation: Unlike in human medicine (where regulators like the FDA enforce strict testing and marketing standards), there is no clear regulator in the veterinary world. Companies are not forthcoming about the training datasets or testing methods they use, forcing vets to rely solely on the company’s marketing claims (e.g., 95% accuracy), which may be misleading.
- Liability: Most AI companies include terms and conditions stating that the interpretation is still the decision of the vet, meaning that if an animal is adversely affected, the liability falls back onto the veterinary professional.
The Data Gap in Veterinary Medicine
- In human medicine, algorithms can be trained on massive, consistent datasets (e.g., 120,000 frontal chest x-rays of the same species/breed, taken under perfect conditions with confirmed follow-up data).
- Achieving the same level of data parity in veterinary medicine is incredibly challenging because of the variation introduced by different breeds, rotations, and views. It is estimated that 15 million x-rays would be needed to reach the same performance level, a massive project that would take years.