Customer details

Mountain View
20
Employees
Data Infrastructure
Arcee Enterprise
Case Study

GenAI for Law: Domain-Adaptation of a Language Model Specialized in Patents

With our customer Activeloop, we've co-created PatentPT – the most advanced LLM and retrieval system for patent search and generation, trained on U.S. patent data.

50%

Fewer Hallucinations

2.5x

Faster Response Times

< 1 month

Model Delivered in 3 Weeks

THE PROBLEM

Activeloop helps enterprises organize complex unstructured data and retrieve knowledge with AI. Many of its customers work in heavily regulated industries.

For a subset of its customers, Activeloop needed to provide highly accurate AI search across all U.S. patents and to build a patent generation engine powered by a custom language model.

The U.S. Patent and Trademark Office (USPTO) website is a portal to an incredible amount of knowledge: the USPTO dataset consists of over 8 million patents, and its corpus of text contains some 40 billion words.

But – as anyone who has visited the USPTO website can attest – it’s a site that’s notoriously difficult to navigate, with a slow and rigidly structured search engine (we suspect it’s running on Cobalt servers, without any neural network execution).

When Activeloop approached us to co-develop a GenAI approach to U.S. patent data, we were thrilled with the opportunity. Together, we saw it as a challenge to make the incredibly rich dataset of U.S. patents more easily accessible to a broader audience.

The goal was to build a retrieval engine with powerful search and generation capabilities – including:

  • Autocomplete
  • Patent search on Abstracts
  • Patent search on Claims
  • Ability to generate Abstracts
  • Ability to generate Claims
  • General chat
Davit Buniatyan
Founder

Arcee AI paves the way in domain-specific Small Language Model development. We’ve collaborated on PatentPT... and also on a combination of other bespoke and fine-tuned SLMs by their team. If you’re looking for a great partner that has the best expertise in unlocking the value of language models for your private data at a reasonable cost, Arcee AI is the perfect choice!

THE RESULTS

  • FAST TIMELINE FROM DATA TO DEPLOYMENT
    We built, trained, and deployed PatentPT in less than three weeks, using Arcee Enterprise to train a custom language model and Activeloop Deep Lake to structure and accurately retrieve unstructured text data, along with the Deep Lake dataloader for model training.

  • PERFORMANCE THAT BEATS OPENAI
    With the Deep Lake query engine and Arcee's suite of optimization tools, we achieved 50% fewer hallucinations and 2.5x faster response times than an OpenAI Ada + Pinecone setup.

PRODUCTS USED

Activeloop deployed Arcee Enterprise, built on AWS, to ensure they had the most secure, resilient, and cost-effective environment for their domain-specific Generative AI models.

With Arcee Enterprise built on AWS, their data never leaves their VPC.

PatentPT was also powered by Activeloop Deep Lake for data storage, retrieval, and model training.

Make your GenAI ambitions a reality with Arcee AI’s end-to-end system for merging, training, and deploying Small Language Models (SLMs).

Try our hosted SaaS, Arcee Cloud, right now – or get in touch to learn more about Arcee Enterprise.
