March 20, 2026
The Power of AI at CPU Prices
Intel Xeon 6 processors support enterprise artificial intelligence applications for a fraction of the cost of GPUs.
When generative artificial intelligence first hit the enterprise market, organizations were lining up to buy expensive, difficult-to-source GPUs to power their solutions. But as the market continues to mature, leaders are being more careful with their investments, trying to balance costs with tangible business benefits. And many are realizing that they don’t need the power of a GPU to support some of their most valuable AI applications.
The capability of large language models is defined by the number of variables they can adjust during training to learn patterns and relationships in data. These variables are called parameters, and LLMs can contain tens to hundreds of billions of parameters, depending on their design and intended use.
For many enterprise deployments, models in the 7 billion to 15 billion parameter range are often sufficient, especially for targeted use cases such as retrieval-augmented generation (RAG), summarization, document processing, and domain-specific assistants. This is the sweet spot for Intel Xeon 6 processors, powered by Intel Advanced Matrix Extensions.
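As a rough back-of-the-envelope illustration (not from the article), the reason this parameter range fits on CPU servers comes down to memory: a model's weight footprint is roughly its parameter count times the bytes per parameter, and bfloat16 (2 bytes per parameter) is one of the reduced-precision formats Intel AMX accelerates.

```python
def model_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the model weights."""
    return num_params * bytes_per_param / 1024**3

# Weights-only estimates at bfloat16 precision (2 bytes/parameter).
# Actual serving needs extra headroom for activations and KV cache.
for params in (7e9, 15e9, 70e9):
    print(f"{params / 1e9:.0f}B params @ bf16: ~{model_memory_gb(params, 2):.0f} GB")
```

At bf16, a 7B-parameter model needs on the order of 13 GB for weights, well within the RAM of a mainstream two-socket server, while a 70B-class model starts to demand the kind of memory and bandwidth that pushes buyers toward GPUs.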
The Right Processors Offer Four Big Benefits for AI
As the “funny money” era of AI draws to a close, organizations have largely stopped writing blank checks for GPUs. Instead, they are turning to powerful CPU infrastructure such as Xeon 6 to support high-impact AI applications.
Intelligent monitoring: Computer vision and intelligent monitoring illustrate how a well-tuned, rightsized model can perform every bit as well as one that is overprovisioned. Consider a hospital system that wants to use AI to track staff movement and presence in patient rooms to support compliance, operational efficiency and patient safety. A general-purpose model might get 95% of the way there, detecting human figures, tracking movement and identifying roles by uniform color. A healthcare system might use GPU infrastructure to fine-tune the model for its environment (training it to recognize that nurses wear burgundy scrubs, for example), but then hand off the model to a Xeon 6 processor for day-to-day inference. This way, organizations can rightsize their production environments while reserving their GPUs for intensive tasks such as model training.
Image processing: Beyond video monitoring, Xeon 6 supports a range of image and document processing use cases. For example, AI models might extract structured data from unstructured visual inputs, classify images at scale or process scanned documents or forms. For organizations dealing with high-volume document workflows (such as insurance claims, logistics receipts or medical imaging), inference-stage image processing on CPU infrastructure can deliver the power they need without GPU-level spending.
Technical documentation: One of the most compelling use cases for rightsized AI is RAG, a technique that allows an LLM to pull answers directly from a curated set of documents, rather than relying solely on its training data. In the military, for example, each aircraft might have a 10,000-page technical manual. When a RAG system indexes this information, users can ask questions about specific components in natural language and receive instant answers, rather than wading through thousands of pages. Similarly, teams in fields such as healthcare, finance, engineering and law can use RAG-powered tools to quickly access information hidden within dense documentation. By supporting these models with Xeon 6, organizations can both cut costs and keep their sensitive data in-house.
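The mechanics behind this are simpler than they sound: retrieve the passages most relevant to the question, then hand them to the LLM as context. Here is a minimal, standard-library-only sketch of the retrieval step, using hypothetical manual excerpts and naive term-overlap scoring in place of the vector search a production deployment would use.

```python
import re
from collections import Counter

# Hypothetical manual excerpts; a real system would index the full document set.
MANUAL_PAGES = {
    "p-102": "The hydraulic pump operates at 3000 psi and must be inspected every 50 flight hours.",
    "p-417": "Replace the fuel filter element whenever differential pressure exceeds the placard limit.",
    "p-833": "Landing gear actuators use the main hydraulic system; see pump inspection intervals.",
}

def tokenize(text: str) -> Counter:
    """Lowercase the text and count its words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k page IDs whose text shares the most terms with the question."""
    q = tokenize(question)
    scores = {page: sum((tokenize(text) & q).values()) for page, text in MANUAL_PAGES.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def build_prompt(question: str) -> str:
    """Assemble the retrieved passages and the question into an LLM prompt."""
    context = "\n".join(MANUAL_PAGES[p] for p in retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How often should the hydraulic pump be inspected?"))
```

The key point for infrastructure planning is that only the final step, generating the answer from the assembled prompt, touches the LLM at all, and that single inference pass over a rightsized model is exactly the workload a Xeon 6 server can handle.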
Business unit support: Xeon 6 processors are powerful enough to run RAG-enabled models that support core business functions at the departmental level. In HR, for example, a model drawing on employee handbooks, benefits documentation and policy guides can become a self-service resource that handles routine questions without burdening HR staff. A sales model grounded in product specifications and pricing documentation can help account managers answer customer questions faster, without escalating to product teams. And in engineering, a RAG deployment can provide a queryable interface for vast libraries of specs, design documents and historical project data. At Intel, we’ve deployed exactly this kind of model internally. When a customer asks a detailed question about whether a specific Intel Ethernet card carries a particular compliance certification, the answer is just an AI query away.
The ROI calculus of AI is shifting, and leaders want to know how much they’re spending — and what they’re getting for their money. Increasingly, they are realizing that turning to CPUs to power AI isn’t a matter of compromising but rather intelligently allocating resources. I think of it a bit like cars: Can you use a high-end sports car to make grocery runs? Sure. But even though it’s the most expensive option, it’s not really the right tool for the job.
Find out how to accelerate your AI infrastructure with Cisco and CDW.
David Bickford
CDW Expert