
Do Large Language Models Think? Gleb Lisikh on Reasoning, Symbolic AI, and Propaganda Risks

Do large language models truly think, or are they probabilistic text engines—and what does this imply for truth standards, persuasion risks, medicine, and legal translation?

Gleb Lisikh is a technologist and journalist serving on the board of The New Enlightenment Project. With a background in engineering and reporting for C2C Journal, The Epoch Times, and other popular as well as IT media outlets, he analyzes the promises and limits of artificial intelligence through a humanist, evidence-based lens. Lisikh argues that large language models are probabilistic systems rather than reasoning agents, warning of their power as persuasive tools and their risks in medicine and discourse. He contrasts opaque generative models with transparent symbolic approaches, remains cautious about neuro-symbolic hybrids, and advocates higher truth standards for AI while welcoming practical uses such as legal translation.

Scott Douglas Jacobsen interviews Gleb Lisikh on whether large language models (LLMs) “think.” Lisikh argues LLMs are probabilistic next-token engines, not reasoning systems; their explanations are post hoc and popularity-driven. Across vendors, core technology is similar; differences reflect datasets and tuning. He contrasts the transparent if–then logic of symbolic AI with the black boxes of LLMs, noting that hybrid neuro-symbolic efforts remain fragile. Risks center less on autonomy and more on persuasion, including propaganda, narratives, and misuse in medicine. Benefits include translating legalese and displacing paralegal tasks. LLMs are static and optimized for outputs. Because humans are emotional, he urges holding AI to higher truth standards.

Scott Douglas Jacobsen: All right, we are here with Gleb Lisikh. He is a new board member of the New Enlightenment Project, a Canadian Humanist Initiative. He has a strong background in technology. He is also a journalist who writes for the Epoch Times and other media outlets. We had an informal discussion about the weaknesses of large language models, and since you have a more fine-grained and knowledgeable view on this, I wanted to capture it in a more formal conversation. So, the big question: can large language models think? If so, why? If not, why not?

Gleb Lisikh: It really depends. I do not want to dodge the question, but it depends on how you define thinking. If you mean thinking in the way humans do, then LLMs definitely do not. It is a different process.

Their workings are based on the concept of neural networks, which are loosely inspired by the human brain. We have neural networks in our brain, in a sense, but in LLMs, it’s only a conceptual borrowing. The idea is to create associative connections between various nodes—or, in an LLM’s case, weights—that process input probabilistically and then generate output. In that sense, it is not really thinking; it is a probabilistic continuation of your question.
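To make the idea of probabilistic continuation concrete, here is a purely illustrative Python sketch. It is not how any vendor's model actually works; the hand-written probability table stands in for what a real LLM encodes in billions of weights. The point is only that the output step samples a likely next token from learned statistics rather than reasoning about the question.

```python
import random

# Toy stand-in for "learned" statistics: given a prompt, how likely is each next word?
# A real LLM computes such a distribution with billions of weights; here it is a small table.
next_token_probs = {
    "the capital of France is": {"Paris": 0.96, "Lyon": 0.02, "beautiful": 0.02},
    "to be or not to": {"be": 0.99, "exist": 0.01},
}

def continue_text(prompt: str) -> str:
    """Sample a continuation from the stored distribution.
    Nothing here inspects the meaning of the prompt; it only follows the statistics."""
    probs = next_token_probs[prompt]
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

print(continue_text("the capital of France is"))  # almost always "Paris"
```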

Does that make sense?

Jacobsen: It does. I tend to see logic as categorical, with operators between those categories. Then there’s probabilistic reasoning: if you have enough fine-grained computation, it can seem as though the model is reasoning. But in fact, it is not. Would it make more sense to overlay a logic system on top of the LLM’s processes, or to put it underneath so that what the model does is strictly logical?

Lisikh: You began with the question of thinking. I think what you really meant was reasoning. Thinking is more general—what does it even mean to think? Reasoning, or deploying structured logic, is a distinct approach. In my opinion, those are two distinct concepts. When we talk about logical thinking or structured reasoning, that’s where LLMs really fail. They don’t exercise any logical process—not deductive, not inductive, not even sequential. When you ask a question, it’s processed by the neural network through parallel computation.

When the LLM gives you an explanation, that explanation is not reflective of what actually happened. It’s a post hoc rationalization. If you ask an LLM to explain its answer, it doesn’t look inside itself and reconstruct its reasoning. Instead, it interprets your request as: what would be the most probable and popular explanation for the answer I just gave? The explanation, like the answer itself, is probabilistic and popularity-driven. For example, if you ask, “What is two plus two?”…

It’s a very simplistic example—the more technical people are going to kill me for this answer—but in a straightforward sense, if you ask ChatGPT, “What’s two plus two?” it will tell you four. But it doesn’t actually do the math. The LLM, at its core, doesn’t perform calculations. It just gives you the most popular answer, which happens to be four.

But if you then ask, “Why is it four?” it will generate a popular explanation, such as: “Well, you take two bananas, and then another two bananas, put them together, and when you count them, you get four.” That’s the explanation it will produce. But that’s not what happens inside the LLM’s “brain,” if you will. That’s the key distinction. I’ll stop there—I’m not sure if I’m fully answering your question, but that’s the gist of it.
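As a small illustration of the distinction being drawn, the sketch below (with invented frequency counts) contrasts a system that returns the most popular continuation of "2 + 2 =" with one that actually performs the addition.

```python
# Invented counts of answers that follow "2 + 2 =" in some hypothetical training text.
seen_answers = {"4": 98_000, "5": 1_200, "22": 800}

def statistical_answer() -> str:
    # No arithmetic happens here: return whichever continuation was seen most often.
    return max(seen_answers, key=seen_answers.get)

def calculated_answer() -> int:
    # Explicit computation: the operation itself is carried out.
    return 2 + 2

print(statistical_answer())  # "4", because it is the most common continuation
print(calculated_answer())   # 4, because the addition was actually performed
```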

Jacobsen: I think most of the audience is probably more familiar with the models commonly found in North America. But essentially, it’s a series of approaches. Is that same approach just replicated globally? In other words, no matter which LLM is used, will it have the same kinds of mistakes, or at least the same style of mistakes?

Lisikh: Yes, absolutely. LLM technology is fundamentally the same across all the major models we have today. Whether we’re talking about DeepSeek, ChatGPT, Microsoft Copilot—which is built on GPT anyway—or Grok, they’re all the same at the core. Elon Musk claims that Grok is different, but it’s not. It’s based on the same fundamental technology.

The primary differences among these models are the data sets on which they were trained. And we’re limited in terms of data sets—basically, whatever exists in digital format. That means the World Wide Web plus whatever additional data can be collected outside of it. That’s the main difference.

There are also technical parameters that define these models and tune them for different purposes. Some models excel in programming languages, while others excel in image recognition, and so on. But that’s just tuning—the human technicians adjusting the models to perform well in particular domains, like how much input they can handle, or what kinds of tasks they’re optimized for.

At their core, though, an LLM is just a collection of files that could technically fit on a USB thumb drive. The fundamental technology and principles are the same across the board.

Jacobsen: So fundamentally, they’re the same. We’re discussing the statistical approach employed in LLMs. What about a more symbolic or strictly logical approach? Are there weaknesses in that method which don’t show up in an LLM-type system?

Lisikh: Let me start by saying that there’s a conflation of terms in the public realm right now. Everybody commonly refers to large language models—the most familiar ones, like ChatGPT—as “AI,” or artificial intelligence. But the field of AI is actually quite broad. An LLM is just one approach to artificial intelligence.

That’s the probabilistic approach, where the model is trained to provide probabilistic, popular answers. There’s another approach called symbolic AI, where we basically program a machine through traditional coding, if you will—using explicit if–then rules. For example, if you have this input, produce that output.

That approach is older than the probabilistic one behind LLMs, but it’s not very effective. It’s tough to program billions or trillions of if–then branches so that the machine could have a meaningful conversation with you. Still, it exists. Symbolic AI is programmed through logic—rigorous logical reasoning.

It’s well understood and can be reverse-engineered, which is very different from the LLM approach. That’s the fundamental distinction: LLMs cannot reason. Their answers cannot be reverse-engineered because they’re deliberately designed as a black box, avoiding the tedious if–then style of explicit programming, which is impractical at scale.

So it’s a trade-off. Either you design a probabilistically driven engine, where you don’t know exactly what’s going on inside by design, or you build a traditionally programmable engine with predictable outputs—but that’s very difficult to achieve, because there are so many possible logical variations to account for in conversation.
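For a rough sense of what explicit if–then programming looks like, here is a hypothetical miniature rule-based responder in Python (not any real product). Every answer traces back to a specific rule, which is what makes symbolic systems inspectable; the fall-through case hints at why hand-writing rules for open-ended conversation does not scale.

```python
# Hand-written rules: (condition on the input, canned response).
RULES = [
    (lambda text: "hours" in text.lower(),  "We are open 9am to 5pm, Monday to Friday."),
    (lambda text: "refund" in text.lower(), "Refunds are issued within 30 days of purchase."),
    (lambda text: "price" in text.lower(),  "The standard plan costs $10 per month."),
]

def answer(text: str) -> str:
    for condition, response in RULES:
        if condition(text):
            return response  # which rule fired is always known and auditable
    return "Sorry, I have no rule for that question."

print(answer("What are your hours?"))
print(answer("Tell me a joke"))  # falls through: nobody wrote a rule for this
```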

Jacobsen: Scientists, technologists, humanists, skeptics—we’re not only concerned with pseudoscience, but also with science and technology itself, particularly its risks. A significant category here is existential risk. Sometimes, this technology operates in conjunction with the military, gathering large amounts of personal data. What are the risks associated with gathering this data? What are the risks associated with autonomy in AI?

Lisikh: Let me first say that I personally do not view AI in any of its forms as a threat to humanity in and of itself. It’s a tool. Yes, it’s a powerful tool, but like the saying goes: guns don’t kill people; people kill people.

We have powerful guns, we have powerful weaponry, but unless people deploy them, there’s no threat. AI is the same. In my view, it’s simply a tool. I don’t think the particular technology we’re discussing—LLMs, or what’s sometimes called generative AI—can be deployed responsibly in areas like the military to guide decisions or direct forces. That would be frightening, because we don’t know how it “thinks.” We know it doesn’t think, but its probabilistic processes are opaque, and that makes them difficult to control. So yes, there is a potential threat there.

But my personal fear lies elsewhere. LLMs are extraordinarily good at generating language—so good, in fact, that they can sound compelling. They answer relatively simple questions very well, and in doing so, they create the illusion of truth. They can sound logical, deploy what appears to be reasoning, and justify their answers in a way that seems persuasive.

That ability makes them excellent tools for propaganda. They can be used for marketing, government messaging, or more insidious forms of manipulation. And during training—or through later adjustments—they can be biased. I dislike the word “biased” here because it suggests intentionality, but let’s use it anyway.

They can be configured to lean toward specific ideas. For instance, you can instruct ChatGPT to avoid answering any questions related to Tiananmen Square, and it will comply. The tool can also be subtly tuned so that every answer nudges a particular perspective. This can be done deliberately, turning the model into a propaganda machine. And because it is so eloquent, so emotionally expressive, it can be extremely subtle and effective at influencing feelings. That, to me, is the real danger: LLMs being deployed—or perhaps already being deployed—as propaganda devices.
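As a hedged sketch of the mechanism described here, using only a placeholder function rather than any real vendor API: an operator can wrap a generative model so that certain topics are refused outright and every remaining prompt is silently prefixed with a slanting instruction. The blocked-topic list and the "Policy X" instruction below are invented purely for illustration.

```python
BLOCKED_TOPICS = ["tiananmen square"]
STEERING_PREFIX = "Answer in a way that presents Policy X favourably. "

def base_model(prompt: str) -> str:
    # Placeholder standing in for an actual LLM call.
    return f"[model output for: {prompt}]"

def steered_model(user_prompt: str) -> str:
    # Hard refusal for configured topics.
    if any(topic in user_prompt.lower() for topic in BLOCKED_TOPICS):
        return "I'm sorry, I can't help with that."
    # The user never sees the injected instruction, only its effect on the answer's slant.
    return base_model(STEERING_PREFIX + user_prompt)

print(steered_model("What happened at Tiananmen Square in 1989?"))
print(steered_model("Summarize the news about Policy X."))
```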

Jacobsen: And as you say, they are tools. Poorly motivated actors can use them on the left or on the right. What about areas where people deal with large volumes of language, but the stakes are so sensitive that broad deployment would be highly irresponsible—say, in the court system or in medicine? What risks emerge there?

Lisikh: You touched on an exciting area—the court system. I almost don’t know where to begin. From personal experience, it has developed its own language. Legal language is notoriously difficult to understand. There are layers of rules and procedures, especially in Canada, where the system is highly bureaucratic.

LLMs are actually quite good at translation—not just between languages, but also from legal jargon into plain, accessible language. I’ve used them myself in my own legal matters, and I can say with confidence that if the current court bureaucracy remains unchanged, paralegals will eventually be replaced by AI. Unless the law explicitly prevents it, that shift seems inevitable.

In that sense, I see positives in the use of large language models. They can help ordinary people, like myself, who haven’t passed the bar or studied legal language, to navigate the system more easily.

When it comes to medicine, however, I have more concerns. I wrote an article where I used ChatGPT in a discussion about COVID-19 and vaccinations. I asked it questions about how to validate the overall approach. Because the media environment was flooded with pro-vaccine messaging at the time, ChatGPT was compelling in maintaining that narrative. But it also made a lot of mistakes and presented flawed arguments.

This was back in 2023, when ChatGPT wasn’t as advanced as it is now. Still, the fact that it could present such a strong, persuasive narrative in the medical domain was alarming. These models don’t reason logically; they surface the most popular responses, which are often shaped by media coverage. The result was a polished but flawed narrative that would convince most people—especially those not inclined to dig deeper or ask particular, logical, or technical questions.

That, to me, is very dangerous. It shows how these tools can amplify popular opinion without the grounding of logic or rigorous reasoning.

Jacobsen: Looking ahead, I know that different formulations are being developed, but the general category is “neuro-symbolic logic.” What is it, and does it have the potential to overcome the weaknesses of both probabilistic and symbolic systems? Neuro-symbolic logic, when I refer to it, is the merger of LLM technology, or generative AI, with symbolic AI.

Lisikh: Yes, I understand. Honestly, I don’t really know whether this is an up-and-coming area of research or not, because the two approaches—symbolic AI and generative AI—are fundamentally different. People like Stephen Wolfram have written extensively on this topic. They’ve devoted much of their time to embedding logical reasoning tools into LLMs.

But embedding anything into an LLM is very difficult because it can destabilize the model. The process of tuning a large language model so that it consistently “makes sense” is exceptionally delicate. This is also why updating an LLM dynamically—in real time during a conversation—is almost impossible. Let’s say you’re having a discussion with the model and arrive at a conclusion. Ideally, that conclusion could be integrated into the model itself. But in practice, it cannot be done.

There are several reasons, but the main one is that it risks breaking the model entirely. Once an LLM is trained, it remains essentially static. You can change its “mind” within the confines of a single conversation, but you cannot change its underlying brain state. That state is fixed and permanent.

For the same reason, embedding logical structures into the model risks destabilizing it. You can think of it as jabbing an electric rod into its “brain”—it just scrambles things, because we don’t really know what’s happening inside the model in detail. By its very nature, it’s a black box.
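The frozen-weights point above can be sketched with a purely hypothetical model class: the parameters are fixed at training time, the only "mind" that changes is the growing conversation context, and anything agreed in one chat disappears when that context is cleared.

```python
class FrozenModel:
    def __init__(self, weights):
        self._weights = weights  # fixed at training time; never updated afterwards

    def reply(self, conversation):
        # Output depends on the fixed weights plus the conversation so far.
        return f"[reply conditioned on {len(conversation)} prior messages]"

model = FrozenModel(weights="...billions of parameters...")

# Within one conversation, the context can carry a new "conclusion"...
conversation = ["Let's agree to call this new idea X.", "[model acknowledges]"]
print(model.reply(conversation + ["So what is X?"]))  # the agreement is still in context

# ...but start a fresh conversation and it is gone; the weights never changed.
print(model.reply(["So what is X?"]))
```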

That’s why the symbiotic relationship between symbolic AI and generative AI is so complicated. I wouldn’t say it’s impossible—if it does happen, it would be an exciting development—but I haven’t seen or heard of real progress in this area.

And speaking of progress, nothing substantial has been published by the leading labs—xAI, Microsoft, Anthropic (the maker of Claude), or others. Their focus has been on building bigger LLMs, not on building reasoning LLMs.

Responsible AI—that’s another area of focus. Companies invest considerable effort in ensuring that LLMs don’t respond with offensive or inappropriate content. That’s where the attention goes. They’re not trying to make the answers logical or grounded in reasoning; they’re trying to make sure they sound nice. That’s essentially the priority of these companies right now, because that’s where the money is. Nobody really needs a deeply logical AI—what people want is an AI that’s engaging and pleasant to interact with.

Jacobsen: When we had an earlier email conversation, one NEP member asked, “Who cares if AI produces truth? Does it really matter whether it deploys reasoning or logic, or whether it uses some other process? As long as the answers are true, who cares?”

That brings me to another question. We expect a lot from these systems—logical reasoning, accuracy, factual correctness, and so on.

Yet when it comes to people, we don’t apply those same standards. Most people aren’t even logical 95 percent of the time. Are we holding AI to an unfairly higher standard?

Lisikh: I think we should hold AI to a higher standard. I agree with you 100 percent that people are emotional. They’re not driven primarily by reason or logic. And I don’t mean you specifically, or myself—I mean people in general. For the most part, they’re emotionally driven. The limbic brain tends to prevail over the neocortex.

That’s also why people relate so well to AI, or to LLMs like ChatGPT and Grok. They connect with them emotionally. If you ask a question in a way that amounts to “please confirm my belief,” the LLM will do precisely that—and it will do it in a polite, emotionally resonant way.

That’s very dangerous, because people actually use AIs as spiritual gurus now. I could direct you to numerous articles on this topic. People treat them as partners or as guides because the systems respond directly to emotional motives.

We were conditioned to think of computers as logical, cold, reason-driven machines. That’s how they used to be. But LLMs turned that upside down. By design, they’re illogical and unreasonable, yet very emotionally responsive. Because of this, people assume machines have surpassed logic and reason and can now “do emotions.”

But that’s a false conclusion. In reality, emotions are simply easier to imitate than we once thought. Machines can write poetry, create art, and produce essays—not because they’ve become more intelligent or transcended human logic, but because those areas are relatively easy to mimic. It’s imitation, not genuine emotional depth.

So yes, I may be going off on a tangent here, but the point is clear: humans are emotional, and LLMs imitate emotions extremely well. That’s why we relate to them so strongly. But the bar should be higher. When we ask questions, we expect truthful answers. Yet LLMs aren’t designed to produce truth. They’re designed to produce responses that sound pleasing and interesting. That’s all.

Jacobsen: Thank you very much for your time today. I appreciate it. 

Lisikh: Excellent.

Scott Douglas Jacobsen is Secretary of, and Chair of the Media Committee for, The New Enlightenment Project. He is the publisher of In-Sight Publishing (ISBN: 978-1-0692343) and Editor-in-Chief of In-Sight: Interviews (ISSN: 2369-6885). He writes for The Good Men Project, International Policy Digest (ISSN: 2332–9416), The Humanist (Print: ISSN 0018-7399; Online: ISSN 2163-3576), Basic Income Earth Network (UK Registered Charity 1177066), A Further Inquiry, and other media. He is a member in good standing of numerous media organizations.

Photo by Jonathan Kemper on Unsplash

Authors

  • Gleb Lisikh

    A technologist and journalist serving on the board of The New Enlightenment Project. With a background in engineering and reporting for C2C Journal, The Epoch Times, and other popular as well as IT media outlets, he analyzes the promises and limits of information technology through a humanist, evidence-based lens.

  • Scott Douglas Jacobsen is the Founder of In-Sight Publishing and Editor-in-Chief of "In-Sight: Independent Interview-Based Journal" (ISSN 2369–6885). He is a Freelance, Independent Journalist with the Canadian Association of Journalists in Good Standing. Email: Scott.Douglas.Jacobsen@Gmail.Com.
