I Hope AI Won’t Make Us Completely Useless: NEAR Co-Founder on the Development of Artificial Intelligence

04.10.2024

Back in 2017, the NEAR team set a goal to create artificial general intelligence (AGI), but eventually — due to the technical limitations and lack of resources at the time — they shifted their focus towards designing a blockchain platform.

However, the developers never fully abandoned this idea. In 2024, the team made a pivot again and launched the NEAR AI research lab, which aims to achieve similar goals.

The editorial team at Incrypted spoke with the visionary behind this initiative — NEAR co-founder Alexander Skidanov. We discussed the role of blockchain in the development of decentralized AI, the challenges developers face, and the risks associated with the emergence of strong artificial intelligence.

As far as I understand, you are the one pushing for the AI component within NEAR.

Yes. I started talking about this back in 2021, but no one really listened at the time. Since then, I've been working on several related projects. Last year, I launched a company fully focused on AI research. Now, NEAR has finally found the resources to work on artificial intelligence, and that company has become part of NEAR under the name NEAR AI.

What led to NEAR’s pivot? You were developing a blockchain platform and now you’ve decided to work on artificial intelligence. As I understand, the goal of NEAR AI is to create AGI.

This was a dream of mine and Illia’s [Polosukhin], and I think many others of this generation share that dream. It feels like AGI is not too far away.

When we started NEAR AI in 2017, the goal was to develop AGI. But then we pivoted towards blockchain.

While working on AGI in 2017, the main issue was data annotation. We also noticed that even something as simple as paying people through PayPal in different countries was incredibly difficult. That’s when we started building the blockchain.

At that time, the only blockchains around were Ethereum, Bitcoin, and IOTA, and there wasn't an easy way to make payments through such networks. Illia and I, both being quite technical, figured that this problem was solvable if enough smart, systematic people worked on it. That's how NEAR was born.

However, as I mentioned, I’ve been pushing AI initiatives within NEAR since 2021. We already have a large platform for data annotation. NEAR’s broader vision now is to enable people to own and control their money and data.

AI is going to develop very quickly. But right now, we’re once again on a trajectory where a small number of companies — OpenAI, Anthropic, and maybe Google — will end up controlling everything, much like what happened with search engines.

The more data these companies gather, the more they can improve their products, and the harder it becomes for external players to catch up.

It’s turning into a “walled garden” again.

Yes. One of NEAR’s main focuses has always been what we call the Open Web. But we won’t be able to build it if AI is “taken over” by companies like Google and Anthropic, as it’s becoming clear that AI will be a significant part of the internet. Everything is moving in that direction, and even the way we interact with applications is changing.

I gave a demo today — it’s hard to describe in text — but the idea is that we are slowly moving towards a world where the frontend of an application can instantly adapt to the user’s needs. AI could drastically change how we interact with the internet, applications, and infrastructure.

And if AI is entirely controlled by corporations, then the Open Web simply won’t happen. So, we need to start thinking now about what we can do.

How can we ensure that foundational models continue to exist in open-source form and compete with those of the big companies? How can we make sure that we have the infrastructure to deploy applications where data doesn’t leak?

Right now, we're at a unique moment where most of the recent breakthroughs in AI are relatively well-known. For instance, we don't know the specifics of what OpenAI is doing with their Q* [Q-star] or Strawberry, but the way GPT-4 was trained is more or less public knowledge. Much of the know-how has either leaked or was created openly.

But I believe there’s a strong chance that in the next year or two, major breakthroughs will occur within OpenAI or Google, and they won’t be accessible outside of those companies.

So, how do we ensure that people who want to do AI research but don’t want to work for OpenAI still have access to the necessary resources? What can we put up against all these closed-door research efforts? These are critical areas to focus on if we want breakthroughs to happen, and that’s why we launched the NEAR AI lab.

What role does the NEAR blockchain play in this vision? Will it be related to data availability or the incentivization of AI agents — what functions will it serve?

It depends on the aspect. For example, data annotation already happens entirely on the NEAR blockchain: all payments to the people who do the labeling, as well as the verification of their work, are handled on-chain.

For those unfamiliar — what is data annotation?

Let’s say you want to apply AI in a specific context. Today, individual models handle common internet tasks very well. For instance, why do models write code so effectively? Because there’s GitHub, which contains massive amounts of code. But if you have a task that’s rare on the internet, the model might hallucinate or perform poorly.

Because the dataset itself is smaller.

Yes. Because in the massive dataset the model was trained on, the specific subset relevant to your application is much smaller. So, you go to experts in the field and say, “Guys, create a dataset that closely aligns with the application we want to build.”

Usually, specialized companies handle this. You explain what you need, and they find people to create the data for you. This works well until you need to create a large volume of data — then the approach starts to break down.

Why? Because, ultimately, people prioritize making money over creating high-quality datasets for you. They will always find ways to do the minimum amount of work necessary to get paid.

With a small amount of data, you can maintain a certain team to check the work. But as the scale grows, it becomes harder and harder.

So, three years ago, I was developing a platform where people would verify each other’s work in such a way that the quality of the data wouldn’t drop.

Blockchain is a perfect fit here. And I believe that this problem is now fully solved. We haven’t opened the platform to external users yet, but we’ve already labeled a huge amount of data ourselves, which we funded for our own research. We’ll likely open it either this year or early next year. It’s entirely built on top of NEAR.
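As a rough illustration of what such peer verification could look like, here is a hypothetical sketch in which annotators review random samples of each other's labels and payment is released only above an agreement threshold. This is not a description of the actual NEAR AI annotation platform; the function names, rates, and thresholds are made up for the example.

```python
import random

# Hypothetical sketch of peer cross-verification for data labeling: each annotator's
# work is reviewed by several randomly chosen peers, and payment is released only if
# the agreement rate clears a threshold. Illustrative only, not NEAR's actual scheme.

def agreement_rate(labels: dict, reviews: list) -> float:
    """Fraction of reviewed items on which reviewers agreed with the annotator."""
    checked = agreed = 0
    for review in reviews:
        for item_id, label in review.items():
            checked += 1
            agreed += labels.get(item_id) == label
    return agreed / checked if checked else 0.0

def settle_payment(labels, reviews, rate_per_item=0.02, threshold=0.9):
    score = agreement_rate(labels, reviews)
    payout = rate_per_item * len(labels) if score >= threshold else 0.0
    return score, payout

# Toy data: one annotator, three peer reviewers each re-labeling a random sample.
annotator_labels = {f"item-{i}": random.choice(["cat", "dog"]) for i in range(100)}
peer_reviews = [
    {k: annotator_labels[k] for k in random.sample(list(annotator_labels), 10)}
    for _ in range(3)
]

score, payout = settle_payment(annotator_labels, peer_reviews)
print(f"agreement: {score:.0%}, payout: {payout:.2f} (in some on-chain token)")
```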

What do you think about what’s happening in decentralized AI?

I think the area where blockchain will see the most use, aside from DeFi and financial transactions, is, first and foremost, digital identity. Right now, everything happens in a centralized manner. My browser has hardcoded information about whom I should trust. This system won’t work in the long term.

Blockchain completely solves the trust issue. If you know my account name, that knowledge guarantees you can verify my signatures because the blockchain will always store the latest key.
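A minimal sketch of that trust model, assuming an ed25519 key (the key type NEAR accounts use) and the PyNaCl library; the in-memory key registry here is a hypothetical stand-in for the real on-chain lookup.

```python
from nacl.signing import SigningKey, VerifyKey
from nacl.exceptions import BadSignatureError

# Sketch: if the latest public key for an account is always stored on-chain, then
# knowing the account name is enough to verify any signature it produces.
# ONCHAIN_KEYS is a hypothetical stand-in for querying the blockchain.

ONCHAIN_KEYS = {}   # account name -> current ed25519 public key bytes

def fetch_latest_key(account: str) -> VerifyKey:
    return VerifyKey(ONCHAIN_KEYS[account])

def verify(account: str, message: bytes, signature: bytes) -> bool:
    try:
        fetch_latest_key(account).verify(message, signature)
        return True
    except BadSignatureError:
        return False

# Demo: "alex.near" registers a key, signs a message, anyone can verify it by name.
sk = SigningKey.generate()
ONCHAIN_KEYS["alex.near"] = bytes(sk.verify_key)

signed = sk.sign(b"hello from alex.near")
print(verify("alex.near", signed.message, signed.signature))       # True
print(verify("alex.near", b"tampered message", signed.signature))  # False
```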

Secondly, blockchain is useful for managing things that happen off-chain. For example, Sia is a great example of how everything happens off-chain, but certain markers are recorded on the blockchain. The same will happen in data monetization networks — everything happens off-chain, while the main network verifies that the data was transferred correctly, and so on.

Ultimately, almost all projects currently at the intersection of Web3 and AI use blockchain in this way. Something happens off-chain, and information is recorded in a decentralized ledger so people know how the data is being used and that the incentives reach the right users.

You mentioned that you don’t believe in companies that build AI-first products. What do you mean by that? Does OpenAI fit that category?

Yes. Of course, OpenAI and Anthropic are long past the point where a project still has to convince anyone it will succeed. But here's another example: if you look at the latest Y Combinator batch and remove the AI filter, only a handful of companies remain. Every startup today says, "We are AI for something."

The problem is that AI is just a tool. Companies are trying to make it the centerpiece of their product, which leads to a flood of projects that don’t differ from one another. They use the same models, and there’s no real distinction between them. That’s why most of them have no chance of surviving. The winner won’t be the one who builds the best AI but the one with a unique use case for their product.

Basically, companies are using existing models rather than developing their own. Is that correct?

Yes. Developing proprietary models is very risky because it’s incredibly expensive. And, of course, they lack the expertise that exists within OpenAI or Anthropic. I believe that creating your own model from scratch is absolutely futile for most companies today simply because they lack the resources.

Could this change with blockchain? Perhaps decentralized model training?

That’s a very interesting question. I don’t think blockchain, in isolation, could help. But if we don’t find a way to pool resources for model training, and if we don’t figure out how to conduct AI research using some sort of distributed grant system, we won’t be able to build a future where OpenAI and Anthropic don’t control the best models.

Creating and training models at the level of GPT-4 requires enormous computational power: clusters with thousands of GPUs, such as the H100. If one company controls such a cluster, the disappearance of that company means we lose access to those resources. We're forced to build a new cluster, which costs a fortune.

There’s a chance that Meta will continue to release models. If it weren’t for them, we would be in a much worse position. But we can’t build an open-source future around a company that is relatively closed off.

In the short term, what is the most likely scenario for how the blockchain community could develop a model at the level of GPT-5 or even GPT-4? It would involve some entity building a large cluster that the community funds together. This would result in a model controlled by that community, which would also share the revenue from its use.

However, in the long term, I believe this approach is also risky. If the mentioned entity disappears, the cluster disappears as well. To continue training the model, we need time to build a new cluster, and it must keep getting bigger and bigger.

So today, it's already difficult to pool 30,000 GPUs. But if we want to move forward, we need more resources. We need people to find a way to train models in high-latency systems.

Then it’s enough to just tap into resources from people who have cheap electricity or idle GPUs. If we find a way to train models in this manner, it would completely solve the problem.

Interesting articles are already emerging from DeepMind and other labs describing how to train models with high latency.

What does high latency mean in the context of AI training?

Today, we require fast data exchange between GPUs. For example, all OpenAI clusters use InfiniBand, while Amazon uses EFA [Elastic Fabric Adapter].

So we’re talking about the fast exchange of information between GPUs?

Yes. Because how are models trained today? You take a model and shard it across devices. There are three dimensions of sharding: model parallelism, pipelining, and replication.

With model parallelism and pipelining, a single model is trained jointly across several GPUs, so one model is distributed over multiple devices. Typically, you use model parallelism within a machine (say, across 8 GPUs) and pipelining between machines. Both methods let you "slice" the model, much like sharding in databases.

However, the more you slice, the worse it works. In practice, you create up to eight shards within a machine (one per GPU), and you can chain a maximum of about 40 such machines. That gives you 320 GPUs. But you need 30,000.

The remaining multiplier is replication: you replicate the same model many, many times, creating, say, around 100 copies of it.

But replication negatively affects training, so you don't use large replication factors.
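To make the arithmetic above concrete, here is a small illustrative calculation; the 8, 40, and 30,000 figures are the approximate ones from the conversation, not exact production numbers.

```python
# Illustrative arithmetic for the three sharding dimensions discussed above.
# The 8 / 40 / 30,000 figures are the approximate ones from the conversation.

tensor_parallel = 8        # model-parallel shards within one machine (one per GPU)
pipeline_stages = 40       # machines chained together via pipelining
gpus_per_replica = tensor_parallel * pipeline_stages   # 320 GPUs hold one copy of the model

target_cluster = 30_000    # total GPUs you want to put to work
replicas = target_cluster // gpus_per_replica          # ~93 data-parallel copies

print(f"GPUs per model replica: {gpus_per_replica}")
print(f"Replicas needed to fill the cluster: {replicas}")
```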

Model parallelism needs a very fast connection; that's why it's only done between GPUs in the same machine. For pipelining, a very fast connection is also desirable.

Replication is less demanding, since each copy runs independently. But you still need to combine all the gradients, and that has to happen relatively quickly: a forward and backward pass takes a couple of seconds, while the gradients, which are the size of the whole model, amount to gigabytes of data.

In a high-latency setup this becomes the bottleneck: a model with hundreds of billions of parameters means gigabytes of gradient data, and even a single gigabyte over a gigabit connection takes eight seconds to transmit.

So we do the forward or backward passes in two seconds, and then spend eight seconds combining the gradients. That doesn’t work.
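A back-of-the-envelope sketch of that imbalance, using the illustrative two-second step time and one-gigabyte gradient payload from above:

```python
# Back-of-the-envelope timing for synchronous gradient averaging over a slow link,
# using the illustrative numbers from the discussion above.

step_time_s = 2.0                    # one forward + backward pass
gradient_bytes = 1e9                 # ~1 GB of gradients to exchange
link_bandwidth_bps = 1e9             # 1 gigabit per second

comm_time_s = gradient_bytes * 8 / link_bandwidth_bps   # bytes -> bits
utilization = step_time_s / (step_time_s + comm_time_s)

print(f"communication per step: {comm_time_s:.0f} s")       # ~8 s
print(f"compute utilization if we wait: {utilization:.0%}")  # ~20%
```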

And all the ideas revolve around the same concept: use a slightly stale gradient. In other words, we still spend eight seconds on communication, but during that time we keep doing forward and backward passes.

Will such an approach work? Gradually, yes. There is hope that this will be fully solved within a year. Or maybe not.
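A minimal toy simulation of the stale-gradient idea, assuming plain gradient descent on a synthetic linear-regression problem. It only illustrates the concept of applying a gradient that lags one step behind; it is not the specific scheme from any particular lab's papers.

```python
import numpy as np

# Toy illustration of training with delayed (stale) gradients: the gradient applied
# at step t was computed at step t-1, standing in for a slow interconnect that needs
# a whole step's worth of time to finish averaging.

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.01 * rng.normal(size=256)

def grad(w):
    # Gradient of mean squared error for linear regression.
    return 2 * X.T @ (X @ w - y) / len(y)

w = np.zeros(10)
lr = 0.05
delayed_grad = np.zeros(10)   # gradient still "in flight" from the previous step

for step in range(200):
    g = grad(w)               # compute happens now...
    w -= lr * delayed_grad    # ...but we can only apply the gradient that has finished syncing
    delayed_grad = g

print("loss with one-step-stale gradients:", float(np.mean((X @ w - y) ** 2)))
```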

But in a world where this problem is solved, we need to be ready to pool resources, build a system where people can jointly provide GPUs, and start training a large model. We need to understand how contributions will be registered and how the model can be monetized.

So the foundation of such decentralized AI would not be a single project, but rather a group of protocols? For instance, Akash provides the computing power for training, while some other project handles the incentive layer, and so on.

I can very well imagine a situation where some project can independently build something large. It’s hard to say that definitively, but we will certainly try.

Ultimately, I think it’s important for us as a community to have a path toward decentralized training of competitive models that will always be at the forefront.

There are rumors that GPT-5 will be released soon. And our cutting-edge open-source model, the 405B, is simply incapable of competing with GPT-4. So GPT-5 comes out, and we're already two generations behind.

Can we say that centralized models have a certain network effect? The more users there are, the more data they generate, the faster they evolve, and the harder it becomes to catch up.

Yes, absolutely.

If we talk about AI in general, what is your attitude towards people’s concerns regarding the very fact of the emergence of strong or even general artificial intelligence?

I am generally very pessimistic. I think the scenarios in which all this ends well are fewer than the opposite ones.

But the scenario where everything ends well is quite pleasant. Of course, when we look around, it's clear that on a global scale, the average person is rather unhappy. They cannot reach their potential because they are forced to drive for Uber every day just to make ends meet. They cannot pursue their dreams.

Artificial intelligence could potentially create a world where many more people can achieve their dreams, or so it is claimed. I cannot say with confidence whether that is true, but the assertion is that the amount of routine work will decrease while the resources in the world will not. Accordingly, the average person will have many more opportunities.

Don’t you think that in any case, under any scenario, this all leads to a world where humans no longer make decisions, where everything is decided by AI?

There’s a high chance of that.

I personally hope — though I don’t really believe it will happen — that in the near future, in the coming decades, AI will again reach some kind of ceiling that we won’t be able to break through.

I think that ultimately, in the very long term, if humanity doesn’t go extinct for some reason and continues to develop AI, sooner or later, artificial intelligence will become unattainable for us. We will become irrelevant.

However, I hope there is a chance of reaching some kind of ceiling where AI removes a lot of routine tasks and significantly improves our lives. Yet it either won’t become self-aware or, even if it does, it won’t make us completely useless.

That would be an excellent outcome. But for that to happen, we would need to land in such a narrow zone that it seems unlikely. It's more likely that a singularity would begin.

I can say that from the perspective of an average user who applies ChatGPT to solve everyday tasks, what is happening now seems a bit frightening. The extent to which AI handles individual tasks evokes both excitement and dread.

Yes. But in defense of the notion that it may not be the end for us yet, I want to give the following example. In 2017, when we started working on NEAR, we were in a similar moment — between 2015 and 2017, incredible breakthroughs were happening constantly, every month. It seemed like it was impossible to stop.

However, from 2017 until the release of ChatGPT, there was a significant slowdown. Take GPT-3: the model came out in 2020, but it was very inconvenient to use, and the barrier to entry was high. Most people didn't even know it existed. Then a user-friendly interface appeared, and people realized the technology had actually been accessible since 2020.

Existing models have absorbed terabytes of data that we, as humanity, have created throughout our existence. All the breakthroughs of the last couple of years happened because we keep "squeezing" more useful information out of that data. That can certainly take us quite far. But beyond that point, we will need new datasets, and generating a new terabyte of data is not easy.

So there is a high chance that we will hit a ceiling on what the model can extract from the existing internet. Then the question arises: who can build models that extract data more efficiently? Who can generate data for training? Perhaps we can generate clean data — which is needed in significantly smaller volumes — in the necessary amounts?

There is a chance we will run into a data ceiling. However, there is also a possibility that the internet already contains everything we need; it’s just that we currently don’t know how to extract that information efficiently enough. Gradually, we may simply arrive at a point where some model processes all this information and becomes self-aware.
