
LLM (Large Language Model)
LLMs are built to work with language, which is why "language" is in the name. They can answer questions, write content, translate languages, summarize information, and more.
More recently, many LLMs have evolved into multi-modal models, which means they can handle not just text, but also images, audio, and other types of input, all in one conversation. For example, the latest ChatGPT models can respond to voice, analyze images, and generate text, all in a single chat. This shift began with GPT-4o, where the "o" stands for "omni" (meaning it can take in any mix of text, audio, and visuals).
If you're curious about how LLMs actually work, this primer is a great place to start. For a deeper technical explanation, check out Andrej Karpathy's breakdown.
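As a rough intuition for what "language modeling" means: an LLM is trained to predict the next token given the tokens so far. Real LLMs do this with transformer neural networks over subword tokens, but the core idea can be sketched with a toy bigram model that only looks at the single previous word (a drastic simplification, purely for illustration):

```python
from collections import Counter, defaultdict

# Tiny "training corpus" for the toy model.
corpus = "the cat sat on the mat and the cat slept".split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most likely next word after `word` in the corpus."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" most often here
```

An actual LLM replaces these simple counts with billions of learned parameters and conditions on thousands of previous tokens rather than one word, but the training objective, predicting what comes next, is the same in spirit.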
© 2025 Kumospace, Inc. d/b/a Fonzi