Machine Learning Times
Machine Learning Times
EXCLUSIVE HIGHLIGHTS
How Generative AI Helps Predictive AI
 Originally published in Forbes, August 21, 2024 This is the...
4 Ways Machine Learning Can Perpetuate Injustice and What to Do About It
 Originally published in Built In, July 12, 2024 When ML...
The Great AI Myth: These 3 Misconceptions Fuel It
 Originally published in Forbes, July 29, 2024 The hottest thing...
Where FICO Gets Its Data for Screening Two-Thirds of All Card Transactions
 Originally published in The European Business Review, March 21,...
SHARE THIS:

2 years ago
Productizing Large Language Models

 
Originally posted on Replit.com, Sept 21, 2022. 

Large Language Models (LLMs) are known for their near-magical ability to learn from very few examples — as little as zero — to create language wonders. LLMs can chat, write poetry, write code, and even do basic arithmetic. However, the same properties that make LLMs magical also make them challenging from an engineering perspective.

At Replit we have deployed transformer-based language models of all sizes: ~100m parameter models for search and spam, 1-10B models for a code autocomplete product we call GhostWriter, and 100B+ models for features that require a higher reasoning ability. In this post we’ll talk about what we’ve learned about building and hosting large language models.

Nonsense

Any sufficiently advanced bullshit is indistinguishable from intelligence, or so the LLM thought. LLMs are super suggestible — in fact, the primary way to interact with LLMs is via “prompting.” Basically, you give the LLM a string of text and it generates a response, mostly in text form although some models can also generate audio or even images. The problem is, you can prompt the LLM with nonsense and it will generate nonsense. Garbage in, garbage out. Also, LLMs tend to get stuck in loops, repeating the same thing over and over again, since they have a limited attention span when dealing with some novel scenarios that were not present during training.

To continue reading this article, click here.

10 thoughts on “Productizing Large Language Models

  1. Pingback: Productizing Large Language Models « Machine Learning Times ✔️ Autocomp

Leave a Reply