- GPT-4 is scary good
- How a CBDC will impact digital assets
- Is Generative AI a threat to our privacy?
Welcome to our weekly mailbag edition of The Bleeding Edge. All week, you submitted your questions about the biggest trends in tech and biotech. Today, I’ll do my best to answer them.
If you have a question you’d like answered next week, be sure to submit it right here. I always enjoy hearing from you.
How does ChatGPT work, exactly?
I’ve read many articles (incl. yours) on ChatGPT that sound like people understand how it works. I certainly don’t even with my MS in physics and PhD in molecular biology.
If it was trained by working through all the data on the internet, it must have a gigantic memory. How else could it come up with answers very quickly as is reported?
Even considering a very large data center like Microsoft’s, it seems impossible that they could essentially store everything ChatGPT read on the internet. The sentences it writes must be copies of text fragments it read somewhere and then stored away. Can you please shine some light on this?
Hi, Chris. I’d be happy to answer your question. You’re right that it’s difficult to get our heads around a technology like this, especially when we consider the latest release of GPT-4 and the incredible things it can accomplish.
For instance, GPT-4 was able to produce the code for the popular “Pong” game in under 60 seconds. All it took was a simple prompt. The AI passed a simulated bar exam with a score in the top 10% of all applicants. It also created a website from scratch based entirely off a drawing on the back of a napkin.
Below, we can see the drawing and the result. Notice how the AI not only created a website, but even wrote a joke as prompted.
It’s not often that a new piece of technology floors me. But I have to admit, I spent some time this week experimenting with the tech, trying to grapple with what this means. The implications are profound. I’ll have much more to say about this in future issues.
But for today, your question. How, exactly, does a technology like this work?
First off, GPT-4 is different from its predecessors, GPT-3.5 and GPT-3. Aside from performance improvements and a more nuanced understanding of prompts, the most visible difference is that GPT-4 is multimodal. Specifically, it can receive both text and images as inputs, as shown above.
GPT-4 still only outputs text, so it’s not a text-to-image generative AI. But being able to “see” and understand images is a powerful new tool for this large language model (LLM).
LLMs are trained broadly on the open internet. That’s an important distinction as the majority of the internet is behind a paywall or firewall and cannot be accessed by these LLMs for training.
The LLM ingests all of the information that it receives. Through deep learning, a form of artificial intelligence, it synthesizes that information and builds an understanding of how it all relates and fits together. It does this by establishing relative weights for which types of information are considered most accurate and which words are best suited to go together. One way we can think about this concept is that the LLM establishes confidence levels regarding the correct answer or correct series of words to present as an output.
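For readers who like to see an idea in code, here is a minimal Python sketch of what “confidence levels” can look like. It assumes the model has already assigned a raw score to each candidate word, and it uses a standard softmax function to turn those scores into probabilities. The scores and the three candidate words are made up for illustration. This is not OpenAI’s code, and a real LLM scores tens of thousands of candidates at once.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(score - highest) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

# Hypothetical raw scores for the word that follows "The sky is"
scores = {"blue": 4.0, "cloudy": 2.5, "green": 0.5}

confidence = softmax(scores)
best_word = max(confidence, key=confidence.get)

for word, probability in confidence.items():
    print(f"{word}: {probability:.1%}")
print("Most confident choice:", best_word)
```

The higher a word’s score relative to the others, the larger its share of the total confidence.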
And yes, the computing systems needed to do this are massive. OpenAI’s LLMs are all trained in Microsoft’s Azure data centers. GPT-4’s training ran for weeks at a cost of tens of millions of dollars before the model was “trained” and ready for use. And maintaining and running GPT-4 still requires massive amounts of both storage and computational power.
GPT-4 first uses natural language processing (NLP) to understand the text prompt that it has been given. There have been some improvements in its NLP – it now has a more nuanced understanding of what is being asked of it.
Some of the simpler inquiries are more a matter of recalling specific information from the LLM’s “learnings.” A simple way for us to think about how it works is that GPT-4 has very strong recall of the facts it learned, almost like a photographic memory, although it can still get details wrong. Here is an example of this:
I even intentionally used the incorrect word “tall” in my question. Using “height” or “altitude” would have been a more accurate way to ask the question, but contextually, GPT-4 still understood the information that I was looking for.
But longer-form text responses require the LLM to use its synthesized understanding after extensive training and then to write an answer. It does this by taking what it knows and then producing sentences word by word. Each word is selected by using the weights that it established through its training. Said another way, it chooses the most logical word to follow the previous string of words.
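To make the word-by-word idea concrete, here is a toy Python sketch. It is a drastic simplification: the “weights” are just counts of which word followed which in a tiny made-up training text, and the sketch always picks the single most frequent follower. GPT-4 uses a large neural network, looks at far more than the previous word, and does not necessarily pick the top choice every time. The function names and the sample text are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram_weights(text):
    """Count how often each word follows another in the training text."""
    words = text.lower().split()
    weights = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        weights[current_word][next_word] += 1
    return weights

def generate(weights, start_word, max_words=8):
    """Build a sentence one word at a time, always taking the likeliest next word."""
    output = [start_word]
    for _ in range(max_words - 1):
        candidates = weights.get(output[-1])
        if not candidates:
            break  # no known follower for this word, so stop
        next_word, _count = candidates.most_common(1)[0]
        output.append(next_word)
    return " ".join(output)

# Tiny made-up "training data"
corpus = "the cat sat on the mat . the cat ate the fish ."
weights = train_bigram_weights(corpus)

print(generate(weights, "sat", max_words=4))  # prints "sat on the cat"
```

After “the,” the toy model picks “cat” because that pairing appeared most often in its training text. That is the same basic principle, at a vastly smaller scale.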
Described that way, an LLM sounds mechanical. The magic is that it understands the context of what it is writing, and that shows in GPT-4’s output.
I hope that helps, Chris. There are more technical answers, and there is definitely a little black magic that takes place with neural networks; but at a high level, the above should be good context. This latest version – GPT-4 – is just remarkable. Just imagine what GPT-5 or GPT-6 will look like by the second half of this year. And this kind of AI has incredible implications for your field of biology, and for that matter, life sciences and biotechnology.
We’re on the cusp of unimaginable progress in technological advancement.
One stablecoin to rule them all…
My question(s) have to do with cryptocurrencies, which I don’t understand very well, so please excuse my ignorance. I don’t understand the reason that all the DeFi companies issue their own token. As an investor, it almost seems like these tokens act like shares of stock in the company.
It also seems to me that when the Federal Reserve completes the conversion to a digital U.S. dollar, that “token” would become the default cryptocurrency for everything. Won’t that be a big, disruptive shock to the whole digital currency industry? How can we expect that event will affect the value of the tokens we are holding as investments?
– Richard R.
Hi, Richard. I’m always happy to answer reader questions. And these are important ones to understand as the digital asset industry navigates the next few years. For your first one: Why do blockchain companies issue their own tokens?
We can think of a token within a blockchain project as a “native asset.” In other words, it’s the “currency” used to facilitate transactions across that network. To understand this, we can look at the Ethereum network and its native asset, Ether.
The Ethereum network is a decentralized blockchain. It can execute smart contracts, validate transactions on the network, and much more. And if a developer wants to use the Ethereum network, they need to have some Ether to transact. That’s because Ether is the currency needed to facilitate these transactions. If Ethereum were a toll road, Ether would be the coin we drop in the bucket to gain access.
Hypothetically, if somebody wanted to make a peer-to-peer transaction across the Ethereum network, a small fee paid in Ether would go to the “validators” that confirm the transaction and “write it” into the Ethereum blockchain.
That’s important because decentralized networks like the Ethereum blockchain require participation from several parties. There is no one entity that controls the network. And these parties must be incentivized for their work. That’s why they’re rewarded with small amounts of Ether for contributing to the network.
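The toll-road idea can be sketched in a few lines of Python. This is a toy ledger, not how Ethereum is actually implemented. It assumes the fee is paid on top of the amount sent and goes entirely to one validator, and it ignores real-world details like gas pricing. The account names, the amounts, and the `transfer` function are all hypothetical.

```python
def transfer(balances, sender, recipient, amount, fee, validator):
    """Move `amount` from sender to recipient and pay `fee` to the validator."""
    total_cost = amount + fee
    if balances.get(sender, 0) < total_cost:
        raise ValueError("insufficient balance to cover amount plus fee")
    balances[sender] -= total_cost
    balances[recipient] = balances.get(recipient, 0) + amount
    balances[validator] = balances.get(validator, 0) + fee
    return balances

# Balances in whole units of a made-up smallest denomination
balances = {"alice": 1000}
transfer(balances, "alice", "bob", amount=300, fee=5, validator="validator_1")

print(balances)  # {'alice': 695, 'bob': 300, 'validator_1': 5}
```

The sender pays the toll, the recipient gets the full amount, and the validator is compensated for doing the work of confirming the transaction.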
As for your other question, does that make native assets a security like shares in a company? This has always been a contentious issue in the industry. Some argue that native assets like Ether have utility. We purchase them with the expectation of using the network. And for that reason, they’re not a security in the traditional sense.
Theoretically, a DeFi company could use Ether to transact on its platform. But that usually doesn’t happen. The reason is that most blockchain projects get their initial funding not by selling equity, but by having a token sale, or series of token sales. In order to do that, the project needs to have its own token. Most blockchain projects tend not to look like traditional companies, so the typical structures of selling off equity ownership to fund the project tend not to make much sense. After all, the DeFi project is supposed to be decentralized…
So in that way, having a token enables the project to raise capital to pay for development, incentivize project contributions, incentivize involvement in the related ecosystems, and of course as a way to interact on its exchange/network.
And that’s where the issue with the SEC comes from. Projects are raising capital in exchange for the token. And whether the organization is centralized or decentralized, it is an organization working towards a goal in hopes of making that token appreciate in value. In other words, there is an expectation of a future profit. That’s why these tokens often “smell” like a security. And to your point, it almost feels like buying shares of stock, but it’s not. After all, if that project/company were to go public, we’d only have the token, no stock.
For years, the industry has been begging for clarity from the SEC. They need to know what the regulators view as a security, if there are any exceptions, and how they should interact with investors/users of the network. To date, that clarity still hasn’t come. And it has been holding back the industry in the U.S.
With regard to your question about a digital U.S. dollar, how would the rollout of a U.S. CBDC impact the digital asset industry?
I’ve long held that much of the hostility from government regulators is an attempt to “pause” the industry while the U.S. finalizes its plan for a digital version of the U.S. dollar. In essence, they want to restrain the industry until the architecture is in place for the “Fed Coin.”
And once that happens, the industry will be allowed to operate within the confines that the government sets. As I’ve said before, the message from the U.S. government is basically: You can play in this space, but it must be by our rules.
But I don’t think it will be as large a disruption as you suggest. After all, the Federal Reserve, the Treasury, and the White House primarily want to control the U.S. dollar in whatever format it exists. And they would like to do so on their own terms, with a digital wallet that they can control, and with financial services partners that they anoint.
Once that payment infrastructure is in place, the rest of the digital assets industry can “plug in” and interact. The industry is already very familiar with U.S. dollar stablecoins, so working with a digital U.S. dollar from the Federal Reserve won’t be much different at all.
And if we keep in mind that blockchain technology is the next generation of internet technology, where network economics and monetary incentives are built into the technology architecture… this tech, and its associated digital assets, are going to survive the rollout of a CBDC.
And as with all things, the very best projects will thrive, and the weak ones will fail. And their token values will go up or down depending on how successful they are.
Is ChatGPT a threat to our privacy?
I am amazed by what AI in general and ChatGPT in particular can do to improve our productivity and, in many indirect ways, our standard of living. However, I am concerned about the possible threat it may represent to our private information. Can you share your thoughts on this?
– Antonio N.
Hi, Antonio. It’s great that you raise this question because it is so easy to become enamored by the power of ChatGPT. It’s easy to forget about things like privacy amidst all the excitement.
The reality is that whenever we ask a generative AI like ChatGPT a question, that interaction is ingested into the system. And companies like OpenAI have the ability to record any interactions that we have and build a dossier on us, just like Google, Meta, and so many others do. That doesn’t mean that they’re doing it right now, but it does mean that they can do it.
Every question we ask the AI tells it something about us. What do we like? Dislike? What do we believe? What are we skeptical about? What are we most interested in? All of this is valuable information, especially to a company like Google which is in the business of “mining” our data and using it to target ads.
In fact, this was the primary reason that JPMorgan recently banned the use of ChatGPT by its employees. After all, if JPMorgan employees are sharing sensitive data and making specific inquiries from which intelligence can be derived, that could pose a significant risk. Most major banks followed with the same ban days after JPMorgan took that stand.
That makes sense. After all, ChatGPT is built by a third party – OpenAI – and it runs on Microsoft’s Azure cloud services platform. The inquiries are sent over the internet, processed, and potentially stored in the cloud (off premises), and the answers are sent back over the internet.
For any company or government entity dealing with sensitive information, this is a major concern.
My prediction is that OpenAI, and other companies like it, will productize a general generative AI that they will sell/license to customers. That general product will be trainable on company/government-specific data sets, and it can then be run entirely on an intranet (on premises) without having to run the software on someone else’s servers.
So there are potentially ways to use this kind of powerful technology and protect privacy. But will those options be made available to normal people like us?
I believe that the most widely used generative AI offerings will be made available for free, or at very low cost. This is the model that Google and Facebook used to become two of the most valuable companies in the world. But we should always remember, if it is free, we’re the product.
And as we learned from the Twitter files, the White House, the FBI, CIA, DHS, and many other U.S. government agencies were colluding with companies like Microsoft, Google, Facebook, and Twitter sharing information and censoring/banning scientific research and truth because it didn’t fit a political narrative.
In our current government, the incentives are immense to collect data from social media, searches, and yes, interactions with an AI. The desire to control us and how we think is worse than my worst nightmare.
What can we do? We should be looking for a generative AI company whose business model is not driven by advertising revenues. A business built entirely by subscription models (and one that has no history of colluding with the government) will be a good place to start.
We’ll be watching all companies in this space and will certainly share information on any companies that we feel are good stewards of our information and respect individual privacy.
Editor, The Bleeding Edge