A reader recently asked, “What is DeepSeek, why am I hearing about it everywhere, and is it a cybersecurity threat?” In order: An AI platform, money and politics, and most probably yes. Let’s dive into this and try to make some sense out of the sensational headlines.
DeepSeek itself is an Artificial Intelligence (AI) platform produced by a company owned by High-Flyer, a Chinese hedge fund/investment company. Whatever its official corporate name, the company has come to be known internationally by the name of its flagship product, and we’ll refer to it in this article as just “DeepSeek.” While the company is not owned by the Chinese government, it is a Chinese company, and the government therefore has significant influence over it, its products, and any of the data it collects – meaning that while the platform isn’t a government-owned entity, it might as well be. This becomes important later when we discuss the potential privacy issues that the platform poses.
DeepSeek is a multi-use platform based on Large Language Model (LLM) operations. The first product they have released is a chatbot – a Generative Pre-trained Transformer (GPT) tool similar to ChatGPT from OpenAI or many of Microsoft’s Copilot tools. Basically, you can ask the AI a question, and even if it has not been specifically trained on that question, if it has access to enough data it can produce an understandable answer in multiple languages that can be factually correct. I say “can be” factually correct, because if the LLM is trained on bad data, it will give out bad answers; and all LLMs are susceptible to hallucination. Hallucination refers to when the AI provides an answer that sounds confident and plausible but is incorrect or made up, and it tends to show up when a question pushes the system past what its training data can support (reaching the “end of its envelope”). Think of it like a human brain – the more we learn, the greater the chance that we can accidentally get our “wires crossed” and conflate two things that are not related. Just to be clear, DeepSeek is not immune to this; being very new, it simply has less of a public track record, though most of its answers have been factual so far.
A quick side-bar on AI, since the term is often confusing to approach:
Artificial Intelligence is a blanket term that includes many specialized areas of technology. This can be Machine Learning (ML) that is used to predict outcomes based on large numbers of previous events, GPT systems which can parse input in plain language and supply appropriate output based on massive databases of information in plainly understandable language, and other forms of technology centered around a digital (or quantum) system behaving in a similar way to the human brain. All these sub-groups are AI, but focused on different types of analytics and/or interactions.
You use AI every day: When you visit Amazon, YouTube, or really any other site online, there are ML algorithms that determine what products, videos, or other information gets suggested to you. When you talk to a chatbot on a site to get basic-level help with something, that’s often an LLM “reading” your question and attempting to provide an answer – with varying levels of success, but getting better.
What makes systems like Copilot, ChatGPT, and DeepSeek so different is that they were not built with a specific set of subject-matter expertise. They are instead designed to be able to converse with humans on a huge variety of topics – much like a human would be able to talk about what they do for work, for fun, what was in a TV show they watched recently, etc. This makes GPT-type systems extremely useful for organizations that are looking to automate interactions with customers, employees, and other humans. While we’re not at a point where humans can be replaced by these systems, they are very good at “force-multiplying” employees, allowing the same number of employees to handle more calls, emails, and other interactions without becoming overwhelmed.
In summary, DeepSeek’s first product is a GPT-type AI platform based around a Large Language Model. You can ask it questions on whatever topic you’d like, and it will usually be able to produce a factual, topical answer that makes sense. So, why all the furor over it?
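To make the “suggested for you” idea a little more concrete, here is a minimal sketch in Python of predicting a suggestion from previous events. It is only a toy: the `histories` data and the `suggest` function are made up for illustration, and real recommendation systems at sites like Amazon or YouTube are far more sophisticated than counting which items show up together.

```python
from collections import Counter

# Hypothetical data: each inner list is one visitor's past activity.
histories = [
    ["laptop", "mouse", "keyboard"],
    ["laptop", "mouse"],
    ["laptop", "mouse", "keyboard", "monitor"],
    ["phone", "charger"],
]

def suggest(item, histories, top_n=2):
    """Suggest the items that most often appear alongside `item` in past histories."""
    counts = Counter()
    for history in histories:
        if item in history:
            # Count every other item this visitor also looked at.
            counts.update(other for other in history if other != item)
    # Return the names of the most frequently co-occurring items.
    return [name for name, _ in counts.most_common(top_n)]

print(suggest("laptop", histories))
```

The more histories the system has seen, the better its guesses tend to get, which is also why these systems collect so much data about what users do.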
The two primary reasons are money and politics.
From a money perspective – DeepSeek reportedly does not depend on the top-tier hardware that US-based companies like Nvidia manufacture. The ability of any AI developer to build competitive systems without these chips is extremely rare, and the news sent shockwaves through the technology markets. Because of DeepSeek’s ties to Chinese manufacturers, the ability to build the platform without leaning on the most advanced non-Chinese chips and other components wasn’t a total surprise, but the fact that they produced a system on par with US and EU competitors like Microsoft Copilot was a shock to the technology community. Nvidia and other companies found their stock prices sharply dipping in the days after DeepSeek launched, and that event may signal a downturn in the financial power of hardware manufacturers specializing in chips that power AI. As such, there are powerful monetary pressures being exerted by multiple companies and governments on all sides – leading to a lot of press coverage and no small amount of controversy.
From a political perspective, DeepSeek is a Chinese product. The company that owns the company that makes DeepSeek is a well-known Chinese hedge fund, with significant ties to the Chinese Communist Party (CCP). Because of this, data shared with DeepSeek – including all the user information, all the questions that get asked, where everyone is asking them from, etc. – will almost certainly be visible to the CCP. This leads to a not-unreasonable fear that confidential or even classified information could end up in the hands of a foreign government that is not precisely friendly to US, EU, and other interests. Based on what we’ve seen end up in ChatGPT, Copilot, various and assorted AI-generated porn-bots, and many other examples, people share confidential, privileged, or outright illegal stuff with these chatbots – information you do not want in the hands of a hostile government. See the disaster over at muah.ai – https://www.malwarebytes.com/blog/news/2024/10/ai-girlfriend-site-breached-user-fantasies-stolen
This makes DeepSeek a double-whammy when it comes to controversy and media coverage since it impacts both major financial concerns and major political concerns all at once. But what about cybersecurity danger? As it turns out, pretty significant.
There are two areas where DeepSeek is a highly probable cybersecurity issue: privacy and overall security.
From a privacy perspective, it is well known that Chinese companies will – and are often required to – share data with the CCP. This means that all the prompts typed in, usernames, addresses, email info, etc. are almost definitely being shared with outside parties – namely the government and intelligence agencies of China. As you might guess, having whatever questions you’re asking being shared with another government isn’t something a lot of people are comfortable with. On top of that, there are indeed many people in multiple countries who are not tech savvy and will treat DeepSeek like Google – asking it anything and everything. Some of these people may work with highly confidential data (think health records) and/or highly classified data (think military). As has been shown in previous leaks from LLMs, this absolutely happens, and it’s absolutely a privacy issue at the very least.
From the perspective of overall security, DeepSeek really dropped the ball. Within 48 hours of it being unveiled to the world, one research group had already found a serious vulnerability in the platform that could allow them to see data in the back-end of the system. Not 24 hours later, another researcher discovered that entire databases of information were visible to the outside world. While that researcher didn’t take a copy of all that data, it’s frighteningly possible that someone else did or will. See https://www.bleepingcomputer.com/news/security/deepseek-exposes-database-with-over-1-million-chat-records/
The fact that two major security issues were discovered in such a short amount of time indicates that the platform was not properly secured before it was launched. The possible reasons for this are varied – from not having the right security folks looking at things to the product being rushed out the door – but the end result is the same. The platform was launched with multiple security issues and inadequate defenses around them. There is a high likelihood that more will be discovered over time, and that data will leak, be stolen, etc. – on top of being shared with the CCP, which most sources agree will happen no matter what.
And there you have it. A combination of monetary and political issues forced DeepSeek to the front pages, and ongoing cybersecurity questions – and the fact that DeepSeek is run by a Chinese company – indicate serious privacy and security issues. While we’re a global society, and not every useful or powerful product comes out of the country we live in, I still would not recommend the use of DeepSeek by anyone until all of these factors are dealt with, mitigated, or otherwise addressed.