
This material is part of RBC's new Education section, where we discuss how to develop skills, make informed decisions, and advance your career deliberately.
In 2025, a study by S&P Global Market Intelligence found that 42% of companies had abandoned most of their AI initiatives, up from only 17% a year earlier. The technology itself is not the issue; it works fine. The problem is that companies try to solve every kind of problem with the same tool.
We've seen many projects where teams deployed GPT-4 where a simpler model, such as a regression, would have been more effective. As a result, the system was slow, the budget ran out quickly, and users were dissatisfied. Yet in company meetings people kept repeating, "But GPT is the most powerful model!"
In such cases, power doesn't matter. What matters is choosing a tool that truly suits the specific task. We'll explore why GPT isn't a panacea and what to do when versatility becomes a problem.
Why isn't there one neural network for all situations?
If we need to hammer a nail, we use a hammer. We could try a microscope instead: it's heavy, and you can hit a nail with it. But it's still the wrong tool.
The same thing happens with neural networks. There are different types of models, each developed for a specific data format. Some work better with sequentially presented data, such as time series. Others are more effective at analyzing text and large data sets. There are models specifically designed for image generation.
Each model type is designed for its own application, and mathematics backs this up. The No Free Lunch theorem states that, averaged over all possible problems, no single algorithm outperforms every other. Different problems call for different approaches, and this is a proven result.
Therefore, a model trained on one type of data won't automatically perform well on another. For example, if you take a network trained on cat photos and simply feed it medical images, the results will be weak unless it is fine-tuned for the specifics of medicine.
What to look for when choosing a neural network
Typically, we start with three questions: what data we have, what result we need, and how much data is available. The type of data is the basis for choosing a tool.
Data size also matters. If you have fewer than 10,000 examples, training large models such as Transformers from scratch is ineffective. In that case, either use transfer learning, which means taking a pre-trained model (such as BERT, Vision Transformer, or MobileNet) and fine-tuning it for the specific task, or choose simpler algorithms.
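The three questions above can be turned into a rough first-pass checklist. The sketch below is only an illustration: the function name `suggest_approach`, the data-type labels, and the mapping itself are assumptions made for this example, and the 10,000 threshold is the rule of thumb from the text, not a hard limit.

```python
SMALL_DATA_THRESHOLD = 10_000  # rule of thumb from the text, not a hard limit


def suggest_approach(data_type: str, n_examples: int) -> str:
    """Return a rough starting point for model selection (illustrative only)."""
    if data_type == "tabular":
        return "simple model (regression, Random Forest, XGBoost)"
    if data_type == "time_series":
        return "sequence model (RNN/LSTM) or a simpler baseline"
    if data_type in {"text", "image"}:
        if n_examples < SMALL_DATA_THRESHOLD:
            # Too little data to train a large model from scratch.
            return "transfer learning on a pre-trained model, or a simpler algorithm"
        return "larger model; training from scratch becomes an option"
    raise ValueError(f"unknown data type: {data_type!r}")


print(suggest_approach("text", 3_000))
print(suggest_approach("tabular", 500_000))
```

A checklist like this does not replace experiments; it only narrows down where to start.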
Why ChatGPT and similar tools can't be used for all tasks
An LLM (large language model) is a model such as GPT-4, GPT-4.1, Claude 4.5, or Llama 3. It is trained on huge text datasets and can write, summarize, answer questions, and translate. But it is not always the best tool.
The biggest mistake is using large language models everywhere possible. Explosion.ai calls this "LLM maximalism": companies integrate LLMs into every process. Need to filter spam? Use GPT-4. Create a summary? Again, GPT-4 or Claude 4.5. Extract dates from text? Again, an LLM.
Problems arise immediately. These models are slower than conventional algorithms, and users are unwilling to wait ten seconds for a response that previously took one. Costs also rise: every unit of text the model processes (a token) costs money, and at scale an LLM bill can run to thousands of dollars.
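To see how per-token pricing adds up, here is a back-of-the-envelope calculation. All numbers are hypothetical and chosen only for illustration; real prices and token counts vary by provider and task, and `monthly_llm_cost` is a helper invented for this example.

```python
def monthly_llm_cost(requests_per_day: int,
                     tokens_per_request: int,
                     price_per_1k_tokens: float,
                     days: int = 30) -> float:
    """Estimate monthly spend from request volume and a per-1,000-token price."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1000 * price_per_1k_tokens


# Hypothetical workload: 50,000 requests a day, 1,500 tokens each,
# at an assumed price of $0.01 per 1,000 tokens.
cost = monthly_llm_cost(50_000, 1_500, 0.01)
print(f"${cost:,.0f} per month")  # -> $22,500 per month
```

Running the same estimate for a task that a simple classifier could handle makes the trade-off visible before the first invoice arrives.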
A real-life example: a company's reputation monitoring system (one that automatically tracks, analyzes, and evaluates what is being said online about a brand, company, or person) was initially built entirely on an LLM. The model filtered texts, generated summaries, and extracted the necessary data. After a month the problems were clear: it was too slow for real-time operation, too expensive as the data volume grew, and there was no way to check the summaries against the original texts.
The solution turned out to be simple. The architecture was split into parts: first, a conventional classifier filters out noise and the text is broken into sentences; the LLM is used only for summaries. As a result, the system became faster and significantly cheaper.
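The split described above can be sketched as a two-stage pipeline. This is a minimal illustration, not the system from the example: `is_relevant` stands in for a trained classifier with a keyword check, `split_sentences` is a naive splitter, and `fake_summarize` is a placeholder where a real LLM call would go. All names are hypothetical.

```python
import re
from typing import Callable, Iterable


def is_relevant(text: str, brand: str) -> bool:
    """Stand-in for a trained classifier: a cheap keyword check."""
    return brand.lower() in text.lower()


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter; a real system would use an NLP library."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def process(texts: Iterable[str], brand: str,
            summarize: Callable[[str], str]) -> list[dict]:
    """Filter cheaply first, then call the expensive summarizer only on what is left."""
    results = []
    for text in texts:
        if not is_relevant(text, brand):
            continue  # noise never reaches the LLM
        sentences = split_sentences(text)
        results.append({
            "sentences": sentences,  # kept so a summary can be checked against its source
            "summary": summarize(" ".join(sentences)),
        })
    return results


def fake_summarize(text: str) -> str:
    """Placeholder for the LLM call; returns the first 60 characters."""
    return text[:60]


posts = ["Acme launched a phone. It sold well!", "The weather is nice today."]
for item in process(posts, "Acme", fake_summarize):
    print(item["summary"])
```

Because the summarizer is passed in as a function, the cheap stages can be tested and timed without any LLM at all.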
When Speed Matters More Than Power: Which AI Models to Use and Why
Transformers are today's leading tools for working with text. They can analyze very large volumes of information and even process text, images, and audio simultaneously.
But they have a drawback: the longer the text, the slower they work. With huge amounts of data, transformers become too "heavy," so faster models are needed for such tasks.
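One reason for this slowdown is that standard self-attention compares every token with every other token, so the work grows with the square of the input length. The snippet below only counts those comparisons; it is a simplified illustration that ignores everything else a real model does, and `attention_pairs` is a name made up for this example.

```python
def attention_pairs(n_tokens: int) -> int:
    """Number of token-to-token comparisons in one standard self-attention pass."""
    return n_tokens * n_tokens


for n in (1_000, 2_000, 4_000):
    print(f"{n:>5} tokens -> {attention_pairs(n):>12,} comparisons")
```

Doubling the input length quadruples the number of comparisons, which is why long documents get disproportionately expensive.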
That said, older architectures such as RNNs and LSTMs remain useful. They are faster, require fewer resources, and suit devices that process data locally rather than in the cloud. They perform well in real-time tasks; for example, they can accurately recognize human actions from sensor data.
Diffusion models have greatly advanced AI-powered image and media creation. Tools like Stable Diffusion, DALL-E, and Midjourney create high-quality and diverse images, and diffusion-based approaches are also being applied to audio, video, and code.
However, these models are slow: generating each result takes a lot of computation, so they are not suitable for applications where the result is needed instantly.
Data Drives Results: How Models Tolerate Error and Noise
Large neural networks are very sensitive to data quality. If the data contains errors, incorrect labels, or a bias toward one class, the model will consistently make mistakes. Without preliminary data preparation (cleaning, balancing, and standardizing the data), the results of such models become unpredictable.
When we say "neural networks," many people imagine something huge and complex. But there are also simpler, more stable, and more predictable algorithms that often produce excellent results when the data is noisy or incomplete. These include Random Forest and XGBoost.
A telling example of failure is the models built to detect COVID-19 from medical images. Most of these studies did not hold up in real-world use. The main reasons were simple:
Start with the problem, not the models. Once you understand exactly what needs to be solved, it becomes clear which type of model suits it best.
Don't limit yourself to a single method: test several models and compare which one produces the best results. AutoML tools can help with this.
AutoML (automated machine learning) refers to tools that automatically select a suitable model, tune its parameters, and validate the results.
That is, instead of manually:
Tools such as AutoGluon, FLAML, and H2O let you quickly try dozens of model options and choose the one that shows the best results on your data, without manual tuning or lengthy experiments.
Don't just look at accuracy. It's important to understand:
Sometimes a model with lower accuracy but a faster response is the better choice for real-world work.
And be sure to record all experiments: what parameters you tried, what results you got. This will allow you to reproduce successful solutions in the future and help your colleagues continue their work without guesswork.
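A record of experiments does not need special tooling to start with. The sketch below appends one JSON line per run to a local file; the file name, the field names, and the helpers `log_experiment` and `load_experiments` are assumptions for this example, and dedicated experiment trackers offer much more.

```python
import json
import time
from pathlib import Path


def log_experiment(path: str, model: str, params: dict, metrics: dict) -> dict:
    """Append one experiment record to a JSON Lines file and return it."""
    record = {"timestamp": time.time(), "model": model,
              "params": params, "metrics": metrics}
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def load_experiments(path: str) -> list[dict]:
    """Read all recorded experiments back, one dict per run."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Hypothetical run with made-up values.
    log_experiment("experiments.jsonl", "random_forest",
                   {"n_estimators": 200}, {"accuracy": 0.91})
    best = max(load_experiments("experiments.jsonl"),
               key=lambda r: r["metrics"]["accuracy"])
    print(best["model"], best["params"])
```

Because each run is a separate line, colleagues can append their own results to the same file and compare them later.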
Tools to help you choose the right model