Explaining zkML: Towards a future of verifiable artificial intelligence

As ZK technology improves, several zkML use cases with strong product-market fit are likely to emerge.

Written by: Avant Blockchain Capital

Compilation: GWEI Research

Background

The past few months have seen several breakthroughs in the AI industry. Models like GPT-4 and Stable Diffusion are changing the way people build and interact with software and the internet.

Despite the impressive capabilities of these new AI models, some still worry about their unpredictability and inconsistency. For example, most of the backend work of online services is now performed by AI models, yet there is little transparency, and verifying that these models behave as expected is a challenge. User privacy is also a concern, since all the data we send to a model API can be used to improve the AI, or be exploited by hackers.

ZKML may be a new way to solve these problems. By injecting verifiable and trustless properties into machine learning models, blockchain and ZK technology can form a framework for AI alignment.

What is ZKML

Zero-knowledge machine learning (ZKML) in this article refers to the use of zkSNARKs (a type of zero-knowledge proof) to prove the correctness of machine learning inference without exposing the model inputs or model parameters. Depending on which information is private, ZKML use cases can be divided into the following types:

Public model + private data:

  • Privacy-Preserving Machine Learning: ZKML can be used to train and evaluate machine learning models on sensitive data without revealing that data to anyone else. This could be important for applications such as medical diagnostics and financial fraud detection. We have also seen teams use ZKML to build proof-of-humanity services on top of biometric authentication.
  • Proof: In a world where most online content is AI-generated, cryptography can provide a source of truth. Some teams are exploring ZKML as a way to tackle the deepfake problem.

Private model + public data:

  • Model authenticity: ZKML can be used to verify that a deployed machine learning model is the one the provider committed to. This matters when users need assurance that the provider is not quietly serving a cheaper model, or that the model has not been swapped out after a hack.

  • Decentralized Kaggle: ZKML allows participants in data science competitions to prove their model's accuracy on public test data without revealing the trained model weights.

Public model + public data:

  • Decentralized inference: This approach mainly leverages the succinctness of ZK proofs to compress heavy AI computation into compact on-chain proofs, similar to a ZK rollup. It allows the cost of model serving to be distributed across multiple nodes.

Since zkSNARKs are set to become a very important technology in crypto, ZKML has the potential to reshape the crypto world as well. By adding AI capabilities to smart contracts, ZKML can unlock more complex on-chain applications. This integration has been described in the ZKML community as "giving eyes to the blockchain".

Technical Bottlenecks

However, ZKML still faces several technical challenges that must be addressed.

Quantization: ZKPs operate over finite fields, but neural networks are trained in floating point. To make a neural network model zk/blockchain friendly, its entire computation trace must be converted to a fixed-point (integer) representation. This may sacrifice model performance due to the lower precision of the parameters.
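
As a rough illustration, the sketch below shows what this fixed-point conversion looks like for a handful of weights. The scale factor, the toy prime modulus, and the helper names are assumptions made for illustration; real ZKML compilers quantize the entire computation graph and rescale after every multiplication.

```python
# Illustrative post-training quantization into a field-friendly fixed-point
# form. SCALE, P, and the helper names are assumptions, not any library's API.
import numpy as np

SCALE = 2**8            # fixed-point scale factor (assumed)
P = 2**61 - 1           # toy prime modulus standing in for a SNARK field (assumed)

def quantize(x: np.ndarray) -> np.ndarray:
    """Encode floats as field elements: scale, round, then reduce mod P."""
    fixed = np.round(x * SCALE).astype(np.int64)
    return np.mod(fixed, P)

def dequantize(x: np.ndarray) -> np.ndarray:
    """Decode field elements back to floats (large values represent negatives)."""
    signed = np.where(x > P // 2, x - P, x)
    return signed.astype(np.float64) / SCALE

w = np.array([0.1234567, -0.9876543, 1.5])
w_q = quantize(w)
print(dequantize(w_q))  # ~[0.125, -0.98828125, 1.5] -- precision is lost
```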

Cross-language translation: Neural network models are written in Python and C++, while most ZKP proving stacks are written in Rust. We therefore need a translation layer to convert a model into a ZKP-based runtime. Such translation layers are usually model-specific, and designing a general-purpose one is difficult.
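
In practice, the first step of such a translation layer is usually to serialize the model into a framework-neutral graph format like ONNX, which compilers such as EZKL (mentioned below) can then lower into circuits. A minimal sketch, assuming a toy PyTorch model (TinyNet and the file name are hypothetical) and the standard torch.onnx.export API:

```python
# Export a toy PyTorch model to ONNX as the hand-off point between the ML
# toolchain and a ZK compiler. TinyNet and the output path are assumptions.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """A small 2-layer MLP standing in for a real model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

    def forward(self, x):
        return self.net(x)

model = TinyNet().eval()
dummy_input = torch.randn(1, 4)   # example input matching the model's shape

# The exported graph (ops + weights) is what a ZKML compiler turns into
# circuit constraints and a quantized execution trace.
torch.onnx.export(model, dummy_input, "tiny_net.onnx", opset_version=17)
```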

Computational cost of ZKP: Generating a ZK proof is far more expensive than the original ML computation. According to experiments by Modulus Labs, proving a model with 20M parameters takes upwards of 1-5 minutes depending on the ZK proof system, with memory consumption of around 20-60 GB.

Figure: proving cost benchmarks from "The Cost of Intelligence" — Modulus Labs

Status Quo

Even with these challenges, we've seen a lot of interest in ZKML from the crypto community, and there are some good teams exploring this space.

Infrastructure

Model Compilers

Since the main bottleneck of ZKML is converting AI models into ZK circuits, some teams are working on base-layer tooling such as ZK model compilers. Starting from logistic regression and simple CNN models a year ago, the field has quickly progressed to more complex models.

The EZKL project now supports models with up to 100M parameters. It uses the ONNX format and the halo2 proof system. The library also supports proving only part of a model.

The zkml library already supports ZK proofs for GPT-2, BERT, and diffusion models.

zkVM

ZKML can also be approached through more general zero-knowledge virtual machines (zkVMs).

Risc Zero is a zkVM built on the open-source RISC-V instruction set, so it can generate ZK proofs for programs written in C++ and Rust. The zkDTP project shows how to convert a decision-tree ML model to Rust and run it on Risc Zero.

We are also seeing teams bring AI models on-chain via Starknet (Giza) and Aleo (Zero Gravity).

Applications

Beyond infrastructure, other teams have begun exploring applications of ZKML.

DeFi:

An example of a DeFi use case is an AI-driven vault, where the strategy is defined by an AI model rather than a fixed policy. Such strategies can leverage on-chain and off-chain data to predict market trends and execute trades, while ZKML guarantees that the model behind the vault stays consistent on-chain, making the whole process automatic and trustless. Modulus Labs is building RockyBot: the team trained an AI model to predict the ETH price and built a smart contract that trades automatically based on the model's output.

Other potential DeFi use cases include AI-powered DEXs and lending protocols. Oracles can also leverage ZKML to provide novel data sources generated from off-chain data.

Gaming:

Modulus Labs launched Leela, a ZKML-based chess game in which anyone can play against a bot powered by a ZK-verified AI model. AI capabilities like this can bring richer interactivity to existing fully on-chain games.

NFT / Creator Economy:

EIP-7007: This EIP provides an interface for using ZKML to verify that the AI-generated content attached to an NFT was indeed produced by a specific model with specific inputs (prompts). The standard could enable collections of AI-generated NFTs and even power a new kind of creator economy.

Figure: EIP-7007 project workflow

Identity:

The Worldcoin project is building a proof-of-humanity solution based on users' biometric information. The team is exploring the use of ZKML to let users generate their iris code in a permissionless way: when the algorithm that generates the iris code is upgraded, users can download the model and generate proofs themselves without returning to an Orb station.

Keys to Adoption

Given the high cost of zero-knowledge proofs for AI models, we think ZKML adoption can start with crypto-native use cases where the cost of trust is already high.

Another market to consider is industries where data privacy is critical, such as healthcare. Other solutions exist here, such as federated learning and secure MPC, but ZKML can leverage blockchain's scalable incentive network.

Wider mass adoption of ZKML may depend on a loss of trust in the existing large AI providers. Will there be events that raise awareness across the industry and prompt users to consider verifiable AI technologies?

Summary

ZKML is still in its early days and there are many challenges to overcome. But as ZK technology improves, we think people will soon find several ZKML use cases with strong product-market fit. These use cases may look niche at first, but as centralized AI grows more powerful and penetrates every industry and even daily life, people may find ever greater value in ZKML.
