The compute landscape is fiercely competitive, so how can Kinesis offer meaningful savings? Below, we explore some of the business and technological strategies we employ to cut costs and pass the savings on to our customers.
As Kinesis expands, its bargaining power increases significantly. A single customer requesting 5 machines from a vendor is not the same as Kinesis negotiating for 5,000 machines. At such a large scale, we secure bulk discounts or exclusive business deals, enhancing cost efficiency.
With larger workloads and a wider variety of customers, we can average out a stable core workload. This allows us to commit to longer-term plans (e.g., reserved instances, savings plans) that individual customers cannot commit to on their own.
Our scale also opens the door to joint ventures, which can achieve even higher savings than retail discount agreements.
Kinesis Portal abstracts away background complexity, allowing workloads to run on the most economical resources without any additional considerations in the front end. The most economical resources might come from the following sources.
Spot instances: these are instances offered at up to 90% savings by top cloud providers such as AWS and Azure. The catch is that, unlike dedicated instances, they can be reclaimed with only two minutes' notice. A customer would normally need to handle these interruptions themselves, which is not an insignificant engineering effort. Thankfully, Kinesis Network handles these resource movements seamlessly.
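To make that two-minute window concrete, here is a minimal sketch of how a workload might watch for an AWS spot interruption notice. The metadata URL is AWS's documented endpoint; everything else (the polling interval, the checkpoint_and_drain placeholder) is our own illustrative scaffolding, not Kinesis code:

```python
import time
import urllib.error
import urllib.request

# Documented IMDSv1 path; instances enforcing IMDSv2 also need a
# session token, omitted here for brevity.
METADATA_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def spot_interruption_pending() -> bool:
    """Returns True once AWS has scheduled a reclaim (roughly two minutes ahead)."""
    try:
        with urllib.request.urlopen(METADATA_URL, timeout=1) as resp:
            return resp.status == 200  # body holds the action and its timestamp
    except urllib.error.URLError:
        return False  # a 404 (or unreachable metadata service) means no notice yet

def checkpoint_and_drain() -> None:
    # Placeholder: save state and stop accepting new work.
    print("Interruption notice received; checkpointing and draining...")

if __name__ == "__main__":
    while not spot_interruption_pending():
        time.sleep(5)
    checkpoint_and_drain()
```

In practice, this is exactly the kind of plumbing Kinesis takes off the customer's plate.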
There is also a ton of underutilized hardware out there, such as:
Gamer PCs
Ex-Ethereum Miners
Media production houses with top end graphics cards
Data centers with idle machines on the shelves, awaiting customers. (AWS and Azure offer such machines as “spot” instances; many other vendors lack the infrastructure to capitalize on these resources, so Kinesis can enable them.)
Since Kinesis Portal can run workloads seamlessly across the globe, we make use of resources that are otherwise hard or impossible for regular retail customers to reach.
Most CPU workloads do not utilize their resources at 100%. In fact, research shows that most workloads sit between 1% and 35% utilization, yet customers pay for the whole machine for the whole month. With Kinesis's serverless architecture, we can run workloads from multiple sources on the same machine, so each machine is utilized much more optimally.
The more of the above cost-saving strategies customers activate, the higher the savings they will enjoy. As always, please remember that the total value of Kinesis comes from more than cost savings; even where we cannot offer savings, we can still offer tremendous value. Customers are most likely to see large savings if one or more of the following apply:
They are currently on AWS or Azure but open to a similarly reputable cloud provider, and they are looking for savings (in this case, Kinesis can run their workloads on a wider variety of data centers, unlocking seamless savings).
They have a large number of microservices and interactive workloads with average CPU or GPU utilization below 50%. Unless they are big number crunchers (running large batch jobs, for example), they will very likely fit this description. This way, instead of paying for the whole month, they pay only for the CPU/GPU cycles they actually consume.
They are willing to make long-term commitments. This lets us commit to reserved instances ourselves and pass on the savings.
Their workloads are Docker images on Ubuntu and can run on a variety of hardware and in a variety of locations.
Their workload can be divided into smaller chunks that run asynchronously and independently; some very large map-reduce and ML training problems are good examples (see the sketch after this list).
GPU specific: Their models can fit into 16 GB or 24 GB of VRAM. This way, we can run their models on a much wider variety of hardware, unlocking even more savings.
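To illustrate the "smaller chunks" criterion, the sketch below splits a job into independent pieces that could run anywhere, in any order. The chunk size and the squaring workload are arbitrary stand-ins:

```python
from concurrent.futures import ProcessPoolExecutor

def process_chunk(chunk: list) -> int:
    # Each chunk is independent, so chunks can run on different
    # machines (here: different local processes) in any order.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i:i + 100_000] for i in range(0, len(data), 100_000)]
    with ProcessPoolExecutor() as pool:
        partials = pool.map(process_chunk, chunks)
    print(sum(partials))  # identical result to running the job whole
```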
Global, Pay-per-Second Billing
First, our platform provides a fully-managed, serverless compute cloud, with billing based on highly granular metering—down to the millisecond. For instance, if a customer’s workload runs for 2.32 seconds on crowdsourced hardware and 4.19 seconds on datacenter hardware, we meter usage down to the millisecond and compensate suppliers precisely for the time their hardware is utilized. Given the scale of some workloads, this often involves thousands of globally distributed machines. The coin’s primary utility is to facilitate these global micropayments instantly and frictionlessly.
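As a toy illustration of that metering, the following sketch turns the two runtimes above into supplier payouts. The per-millisecond rate is a made-up figure, not an actual Kinesis price:

```python
from decimal import Decimal

RATE_PER_MS = Decimal("0.0000001")  # hypothetical JOULE per millisecond

usage = [  # (supplier_id, runtime in seconds), as in the example above
    ("crowdsourced-gpu-17", Decimal("2.32")),
    ("datacenter-node-04", Decimal("4.19")),
]

for supplier, seconds in usage:
    ms = seconds * 1000                  # meter down to the millisecond
    payout = ms * RATE_PER_MS            # compensate precisely for time used
    print(f"{supplier}: {ms} ms -> {payout} JOULE")
```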
Fungible Nature
Our product architecture is fully serverless, meaning applications run across a dynamic "cloud" rather than on specific, dedicated machines (as some other adjacent projects require). If a hardware instance goes down, our managed services detect the issue and immediately failover the workload to a healthy machine (that’s already warmed up and ready to go), ensuring seamless continuity of service from customers' POV. This fungible nature of underlying hardware (where any machine can replace another) makes the coin essential for compensating the owners of this continuously shifting, diverse hardware network.
Tight Integration with the Protocol
Furthermore, the coin is tightly coupled with the Kinesis protocol. At its core, the protocol enables the exchange of CPU/GPU time for JOULE coins in a trustless environment. All of our managed services (such as heartbeat monitoring, failover, capacity management, conversion of customer payments from USD to JOULE, load balancing, etc.) are built on top of this protocol. While managed services building on the protocol could theoretically operate without a coin, the protocol itself cannot, as the coin underpins its functionality.
Staking
A global trustless ecosystem requires financially motivated actors. Staking requirements work as security deposits, ensuring that all actors bring their A game or else pay fines out of their staked JOULE coins.
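A toy model of this deposit-and-fine mechanic (purely illustrative; the real logic lives in smart contracts, not in Python):

```python
class Node:
    def __init__(self, stake: float):
        self.stake = stake  # JOULE held in escrow as a security deposit

    def slash(self, fine: float) -> None:
        """Misbehavior costs the node part of its deposit."""
        self.stake = max(0.0, self.stake - fine)

node = Node(stake=100.0)   # 100 JOULE staked
node.slash(25.0)           # e.g., a missed SLA is fined from the stake
print(node.stake)          # 75.0 remaining
```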
Each year, millions of high-performance CPUs and GPUs are produced, destined for datacenters, businesses, research labs, tech companies, and personal devices. Yet despite this immense global output, much of this compute power remains severely underutilized. Studies on datacenter efficiency reveal that cloud customers typically use less than 20% of the CPU/GPU capacity they pay for. This means the majority of compute resources go to waste—billions of dollars in unused potential that could be powering innovation and discovery.
But it’s not just in datacenters. Compute power is all around us: in smartphones, personal computers, tablets, gaming consoles, wearables, IoT, and more. Incredibly, the smartphone in your pocket today has more computational power than the Apollo Guidance Computer (AGC) that took humans to the moon. Tasks that took the AGC an hour to compute can be accomplished by your phone in mere seconds. Yet, this vast power remains fragmented and inaccessible.
Kinesis Network is here to change that. We harness and unify these underutilized resources into a powerful, fully-managed compute cloud. Designed to serve enterprises, AI startups, and research institutions, Kinesis provides on-demand, scalable compute power—without the overhead of infrastructure management.
At Kinesis Network, our very name reflects our mission: converting idle, untapped compute potential into kinetic energy that drives innovation. Imagine a world where every idle CPU or GPU, no matter where it resides, becomes part of a global compute grid. Whether it’s powering consumer apps with complex AI models like LLMs, performing advanced weather forecasting for predicting the paths of hurricanes, accelerating pharmaceutical research, or powering protein-folding simulations to cure diseases like cancer, Kinesis Network unlocks this dormant capacity and makes it available to meet humanity’s growing computational needs.
As the demand for compute power surges—driven by advancements in AI, scientific research, climate modeling, and other critical fields—Kinesis offers a sustainable, efficient way to access this power. Instead of building new infrastructure, Kinesis taps into the compute potential that already exists, leveraging idle resources from data centers, research labs, and even individual contributors.
Kinesis Network is not just unlocking idle compute power—it’s transforming the way the world thinks about computing. By pooling underutilized resources from across the globe, we offer a cost-effective, scalable solution that adapts to the needs of enterprises, startups, and academic institutions. No more overprovisioning. No more wasted capacity. Just optimized, on-demand compute power, ready when it's needed.
As computing demands surge, particularly in AI and machine learning, enterprises require scalable, high-performance infrastructure. Kinesis Network meets this need with a serverless, fully managed Compute Cloud that leverages pooled, unused resources to deliver efficient and sustainable compute power. This post explores how Kinesis Network provides enterprise customers with the tools they need to focus on innovation, while simplifying and reducing the costs of infrastructure management.
Key Challenges in Compute Infrastructure
Rising Costs: AI-driven organizations often face high expenses related to GPU resources. With Kinesis Network’s pay-as-you-go model, enterprises pay only for active compute usage, avoiding the idle costs common with traditional cloud services.
Sustainable Resource Use: Many businesses are concerned with the environmental impact of underutilized compute resources. By pooling idle capacity, Kinesis Network helps reduce waste, contributing to a lower carbon footprint.
Complex Infrastructure Management: Not all companies have the specialized skills to handle GPU-intensive infrastructures. Kinesis Network addresses this by offering a fully managed platform that eliminates infrastructure complexities, allowing companies to focus on their core applications.
Proprietary Technology for Seamless Compute Sourcing
Kinesis Network’s proprietary software optimizes compute resource allocation from diverse sources. Key features include:
Resource Pooling and Tunneling: By aggregating unused GPU resources, Kinesis Network provides scalable, high-performance compute power, allowing companies to access robust infrastructure without the need for dedicated hardware.
Automated Scaling: The platform dynamically allocates resources based on demand, making it suitable for a wide range of workloads, from periodic batch processing to continuous applications.
On-Demand Flexibility: Kinesis Network’s on-demand scaling, paired with a per-second billing model, enables enterprises to manage variable workloads effectively, avoiding long-term commitments and unnecessary costs.
Benefits for Enterprise Customers
Cost Efficiency: With a usage-based pricing model, Kinesis Network reduces costs compared to traditional cloud services, allowing businesses to optimize their budgets and pay only for the resources they actively use.
Scalability and Flexibility: Access to the latest high-performance GPUs makes it easy for enterprises to handle demanding applications like AI training and inference, while allowing them to scale resources up or down as needed.
Quick Onboarding: The platform streamlines the onboarding process. Businesses can directly upload Docker containers, configure requirements, and let Kinesis Network handle resource management, helping companies deploy projects faster.
Sustainability: Kinesis Network’s approach to pooling unused capacity not only optimizes resource use but also aligns with companies’ goals to reduce their environmental impact.
Multi-cloud: Kinesis Network makes multi-cloud goals easy. Customers simply specify their desired cloud distribution—for example, 60% AWS, 30% Azure, and 10% GCP—and we handle the rest. Our unified compute pools across major providers allow seamless VM selection, giving you flexibility and control.
Kinesis Network is an ideal solution for enterprises seeking to improve efficiency, scalability, and sustainability in their compute infrastructure. Its fully managed, serverless model allows companies to focus on innovation, reducing costs and minimizing the complexities of infrastructure management. For customers ready to optimize their infrastructure, Kinesis Network provides a compelling, forward-thinking solution.
Web3 is a multi-faceted term that encompasses a broad range of technologies and ideas aimed at decentralizing the internet. To better understand it, let’s look at how Web3 builds on its predecessors—Web1 and Web2—while advancing a new vision for how people interact online, with more user control over data, identity, and assets.
Web1 was the first phase of the internet, characterized by static, read-only web pages. Web2, which followed, introduced interactive, dynamic experiences but brought about centralized control by large platforms. In contrast, Web3 seeks to create a decentralized internet, where blockchain technology enables users to interact directly, without intermediaries, and with enhanced control over their data and digital presence.
Web3 envisions a trustless, permissionless internet where digital assets, data, and identities are directly controlled by individuals. This model is often described as the ownership economy, enabling users to own and control their data and digital assets through technologies like decentralized finance (DeFi) and non-fungible tokens (NFTs). Additionally, it promotes an interoperable and user-centric web, allowing for seamless interactions across platforms while empowering users to participate in decision-making processes via decentralized governance.
Web1 – The “Read-Only” Web: Early websites displayed static content that users could only read or view.
Web2 – The “Read-Write” Web: Enabled dynamic content and interactivity, which allowed users to participate in social networks, online marketplaces, and more.
Web3 – The “Read-Write-Own” Web: Empowers users to own, control, and monetize their data, assets, and digital presence.
By building on blockchain and decentralized technologies, Web3 has the potential to redefine industries and reshape the internet as we know it.
Decentralization: Unlike Web2 platforms that are owned and operated by central entities, Web3 applications are often decentralized, distributing control across participants rather than a single authority.
User Sovereignty: Web3 gives users control over their digital assets, data, and identity. This is often achieved through cryptographic wallets, where only the user has access.
Trustless Transactions: Transactions on Web3 platforms don’t require third-party intermediaries. Instead, smart contracts—self-executing agreements on the blockchain—enable secure peer-to-peer interactions.
Interoperability: Web3 promotes interoperability, meaning different applications and platforms can connect and share information seamlessly. This allows users to move digital assets and data across ecosystems without friction.
Transparency: Since blockchain transactions are publicly recorded, Web3 applications are inherently transparent, allowing users to verify the validity of transactions and governance activities.
Web3 spans a range of sectors, each with distinct applications and use cases. Here are a few examples that highlight the diversity of Web3’s potential:
DeFi (Decentralized Finance): Offers financial services without intermediaries, allowing for peer-to-peer lending, borrowing, and trading on decentralized platforms.
DePIN (Decentralized Physical Infrastructure Networks): Powers real-world infrastructure like telecommunications, cloud storage, and compute networks by incentivizing resource sharing.
NFTs (Non-Fungible Tokens): Enables verifiable ownership of digital assets such as art, music, collectibles, and real estate in the digital space.
Social & Creator Economies: Empowers creators to monetize their work directly through tokenized communities, where fans and followers gain ownership in the creator's success.
These sectors highlight Web3’s versatility, reshaping traditional industries from finance and media to infrastructure and beyond.
One of the defining features of Web3 is its approach to funding and ownership. Unlike Web2 companies, which often rely on centralized funding from venture capital, Web3 projects tend to involve the community early on. This community-first model allows for early user participation, giving community members a vested interest in a project’s success and fostering a collaborative environment.
Utility tokens are frequently used in Web3 to provide holders access to specific functions within the ecosystem, rather than being a form of ownership. These tokens often serve as governance rights, staking assets, or utility credits, allowing holders to interact with the network while aligning with the project’s purpose. For instance, community staking or participation-based incentives encourage early adopters to actively engage with the project without relying on speculative token sales. This decentralized approach to funding and ownership enables community members to directly support the project’s growth, fostering a more equitable ecosystem.
In addition to technical advancements, Web3 introduces innovative business models that change how value is created and shared:
Token Economies: Tokenized ecosystems incentivize participation by rewarding users with tokens for their contributions.
DAOs (Decentralized Autonomous Organizations): Enable community governance by allowing token holders to vote on key decisions, making the community part-owners of the project.
Play-to-Earn and Participate-to-Earn: These models reward users for engaging with the ecosystem, be it through gaming, community participation, or content creation.
Data Sovereignty: Web3 allows users to retain control over their personal data, with some models allowing users to earn from their own data.
These models challenge traditional approaches by prioritizing community involvement, shared ownership, and equitable value distribution, offering a fresh approach to how digital platforms operate.
Web3 represents a paradigm shift that goes beyond mere technological advancements. It fosters a new relationship between users, platforms, and digital assets, empowering individuals to take control of their data and assets while participating in decentralized economies. As Web3 continues to evolve, it will likely bring about a wave of innovation across sectors and industries, presenting new opportunities and challenges. Embracing Web3’s principles offers a glimpse into a future internet that is more decentralized, transparent, and community-oriented, with significant implications for society and the global economy.
Challenges of Retail Staking
DePIN networks are permissionless and trustless. Since node operators are not vetted for trustworthiness or reliability beforehand, a security deposit is required to ensure proper behavior. This deposit, known as a stake, involves placing a certain amount of the network’s native tokens into escrow. If a node misbehaves, its stake can be slashed. This financial incentive helps maintain a high level of service reliability (SLA) and assures the network’s users that nodes are acting in good faith.
In practice, however, acquiring and staking the platform’s tokens can be challenging for many retail participants. Consider the steps a supply node operator might need to navigate:
Transfer the native tokens to the node's address. Which depends on...
Find an exchange or OTC market that lists the token and purchase it with another digital currency (e.g., USDT). Which depends on...
To obtain digital currency easily, open a CEX (centralized exchange) account and fund it by linking a bank account. Which depends on...
Ensure the bank account has sufficient funds.
Trust the chosen CEX, operate within a supported country, and possibly convert from the local currency to USD. Which depends on...
Have the money ready. Security deposits are designed to be large enough to dissuade bad actors, meaning substantial capital is required, even for those simply interested in donating compute resources rather than making money.
As you can see, safely and efficiently scaling the supply side of the network is a significant challenge. Kinesis Network introduces a novel approach to overcome these hurdles and enable global, viral-level supply growth while preserving the financial incentives that encourage nodes to behave well.
Introducing Investor Pools
Investors of all sizes can contribute to “investor pools.” Supply node operators can "borrow" from these pools to come online without any friction. Consider the experience for a typical gamer:
Visit the Kinesis site and download the program. (In the future, this software may come pre-installed through partnerships with computer manufacturers or game developers.)
Run the installer and accept the default settings.
When prompted, the user can choose to use their own Joule (the platform’s native token) or simply borrow from the investor pool.
That’s it! The node is online, with no need for a bank account, CEX account, or complicated transfers. The entire process takes just a few minutes.
While there is a DAO-run reference implementation, many such pools can coexist and compete with each other.
But What About Security?
If nodes borrow their security deposit, what ensures they behave well without having immediate “skin in the game”? Kinesis Network’s smart contracts handle this elegantly:
As the node starts earning rewards, it gradually repays the borrowed stake.
As repayments occur, the node’s borrowed stake is converted into its own earned stake.
From the moment it earns any Joule, the node begins building both its stake and its reputation.
With more stake (and thus a stronger reputation), the node receives more invitations to participate in network sessions, creating a virtuous cycle.
Nodes cannot receive unstaked payouts until they reach a minimum amount of self-earned stake. Likewise, they cannot withdraw their stake until they reach a certain threshold. This ensures that even borrowed participants are financially motivated to behave properly in the long run.
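Under our own simplifying assumptions (a flat repayment fraction, interest omitted), the repayment flow described above might look like this:

```python
def settle_reward(borrowed: float, earned_stake: float, reward: float,
                  repay_fraction: float = 0.5):
    """Route part of each reward toward repaying the borrowed stake;
    repaid amounts convert into the node's own earned stake."""
    repayment = min(borrowed, reward * repay_fraction)
    borrowed -= repayment
    earned_stake += repayment
    payout = reward - repayment  # withheld until thresholds are met
    return borrowed, earned_stake, payout

borrowed, earned = 100.0, 0.0
for cycle in range(5):  # five reward cycles of 30 JOULE each
    borrowed, earned, payout = settle_reward(borrowed, earned, 30.0)
    print(f"cycle {cycle}: borrowed={borrowed:.0f} "
          f"earned_stake={earned:.0f} payout={payout:.0f}")
```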
Win-Win-Win for Investors
Win #1: Rapidly increasing the supply of nodes enhances the network’s overall value, improving token utility and strengthening the ecosystem.
Win #2: Borrowed stakes earn interest for investors. The interest rate is managed by pool operators. For example, the DAO pool might adjust interest rates in response to supply levels, or other pool operators could differentiate themselves by offering lower interest rates or special incentives.
Win #3: Partners—such as gaming companies, hardware manufacturers, and influencers—can establish their own investor pools. By directing their audience to use their pools (for example, through custom installation links or specialized software versions), they can earn interest and exert more influence over the nodes they bring into the network. This model is akin to affiliate marketing, creating opportunities for aligned incentives and shared growth.
In summary, Kinesis Network’s approach to investor pools addresses the key challenges of retail staking by simplifying access, ensuring proper incentives, and creating a system in which all parties—investors, node operators, and the broader network—can benefit.
At Kinesis Network, we are guided by principles that ensure our customers never have to compromise. These tenets drive everything we build and innovate:
Reliability
Our technology is designed to be dependable because we know our customers’ businesses—and their own customers—rely on us. Reliability is a foundation we never compromise on.
Security, Privacy, and Confidentiality
We adhere to strict compliance standards and only use resources that meet our rigorous benchmarks to safeguard customer data and ensure confidentiality.
Performance
We deliver optimal performance for all workloads, unless customers intentionally choose to trade performance for cost savings.
Transparency
We operate with complete transparency, never using tools or practices that we cannot proudly explain to our customers.
WYSIWYG (What You See Is What You Get)
With Kinesis, there are no hidden costs or surprises. Our customers always know exactly what they are paying for.
Simplicity
Simplicity is at the core of our product. We hide backend complexities behind an intuitive, user-friendly interface, making everything—from usage to pricing—straightforward and easy to navigate.
These principles are the backbone of Kinesis Network, ensuring trust, clarity, and excellence in every aspect of our platform.
Kinesis Network operates with the efficiency, ease of use, and affordability of a power outlet—but offers greater control and transparency.
The electricity in your home comes from diverse sources—large power plants (nuclear, solar, wind, hydro) and smaller generators like your neighbors' rooftop solar panels. The power grid manages and conditions this electricity, delivering it to you in a consolidated way. You receive a single bill, regardless of the source. There's no need to hesitate before using power; you can host a dinner party or run the dishwasher without worry. The supply of electricity scales invisibly with your demand. If you choose, you can even install your own solar panels and sell excess power back to the grid, lowering your bills.
Kinesis Network assembles a diverse array of compute participants, varying in type and scale, from around the world into a unified compute grid.
Kinesis' compute providers span a wide spectrum, ranging from large data centers to smaller facilities and even individual devices like gaming computers in homes or idle machines in offices.
As a customer, you maintain complete control over your compute source. For proof-of-concept work, the most cost-effective option might be using compute resources from around the world. However, for sensitive tasks involving personally identifiable information (PII) or subject to regulatory compliance, dedicated data centers may be more appropriate. Additionally, if your existing infrastructure is already based in a specific data center (such as AWS or Azure), staying within that ecosystem could be optimal.
Kinesis Network consolidates the compute
Although our network allows low-level access to individual resources in the underlying infrastructure, the real value comes from the seamless unification of those resources in an easy-to-use, 1-2-3 fashion.
Customers upload their containers (in Docker or other supported formats) through the Kinesis Portal.
Afterwards, they set their preferences for minimum hardware specifications (such as minimum RAM, VRAM, and CPU speed).
Upon pressing the start button, the Portal provisions the best servers from the grid. It constantly monitors the performance of the server group, replacing servers as needed and adding to or removing from the group's compute power.
All of this happens entirely seamlessly, automatically, and responsively.
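As a hypothetical sketch of those three steps: the DeploymentSpec fields and start_deployment helper below are illustrative stand-ins, not the actual Portal API:

```python
from dataclasses import dataclass

@dataclass
class DeploymentSpec:
    image: str            # step 1: the uploaded Docker image
    min_ram_gb: int       # step 2: minimum hardware preferences
    min_vram_gb: int
    min_cpu_ghz: float

def start_deployment(spec: DeploymentSpec) -> str:
    # step 3: in the real Portal, pressing "start" provisions the best
    # servers from the grid and keeps the group healthy automatically.
    return f"deploying {spec.image} (>= {spec.min_vram_gb} GB VRAM)"

print(start_deployment(DeploymentSpec("acme/inference:1.4", 16, 24, 3.0)))
```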
Kinesis Network gives secure and easy access to your workloads
Your applications can easily and securely access containers running in the grid. While direct HTTP or raw TCP access to individual servers is available (e.g. for development purposes), we provide a front-end load balancer to simplify the backend complexity. When a customer makes a request, the nearest data center responds with the most performant server available at the moment.
Individual resources are instantly scaled up, down, in, or out as needed. If the health of some servers degrades, new ones are activated. This orchestration is seamless and automatic, operating within your specified preferences and performance goals.
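A simplified sketch of that routing idea: among healthy backends in the nearest region, pick the most performant one. The field names and latency numbers are assumptions for illustration:

```python
servers = [
    {"id": "dc-us-west-1a",  "healthy": True,  "latency_ms": 12},
    {"id": "dc-us-west-1b",  "healthy": False, "latency_ms": 9},
    {"id": "home-node-7731", "healthy": True,  "latency_ms": 38},
]

def pick_server(pool: list) -> dict:
    # Degraded servers are excluded; the orchestrator would activate
    # replacements behind the scenes.
    healthy = [s for s in pool if s["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy backends; trigger failover/scale-out")
    return min(healthy, key=lambda s: s["latency_ms"])

print(pick_server(servers)["id"])  # -> dc-us-west-1a
```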
You pay for only what you actually use
Similar to your electric bill, you only pay for what you actually use, not for what you've reserved or could have used. We meticulously track every instance of usage—down to milliseconds for CPU/GPU usage and bytes for network I/O, storage, and RAM/VRAM occupation. Despite the intricate metering and billing processes, we simplify it for you: you receive a single, comprehensive bill.
For a deeper understanding of your usage, we provide detailed dashboards that show where and when activity occurs. Unlike instance or VM-based workloads, our dashboards display usage per application, revealing the true sources of costs. This insight allows you to target specific areas for optimization and better inform you about your business needs.
Kinesis Network is economical
There is plenty of compute in the world, but
some is overly expensive — mega datacenters can charge high premiums
some is not easily accessible — despite idle capacity in data centers or homes, it remains untapped
and some is inefficiently utilized — cloud providers often offer fixed-size instances, leading to wasted resources like unused RAM, VRAM, or CPU power. For example, virtual machines typically use less than 20% of their CPU capacity, yet customers still pay for 100% of the reservation.
Kinesis Network mobilizes underutilized computing resources, passing the savings on to you.
You can contribute to the network in exchange for credits or the specific compute resources you need.
If you have idle compute resources (perhaps due to over-provisioned reserved instances from AWS, Azure, or other providers), you can offer them back to the grid. As these resources contribute to the network, you'll earn credits for your own use—or even turn a profit if you contribute more than you consume. This system also allows you to convert excess capacity from one type to another. For instance, if you have surplus CPU power but need GPU capacity for an upcoming AI application, you can efficiently trade your CPU resources on the Kinesis Network for the GPU power you currently require.
Scalability is essential for modern applications, and there are two main approaches to scaling:
Scaling Up
Scaling up means increasing the resources allocated to a single server, such as CPU, RAM, or network bandwidth, to enhance the performance of an application. However, individual servers have physical capacity limitations, making infinite scalability impossible. Some workloads (e.g., earlier types of SQL servers) support only single instances, so the only way to improve their performance is by scaling them up.
Scaling Out
Scaling out involves deploying the application across multiple servers to distribute the workload. This approach works best when the application has minimal dependency on state, allowing each instance to operate productively in parallel. For example, many modern applications can run efficiently across a large number of smaller instances, providing better scalability and resilience.
The Kinesis Approach to Scalability
Kinesis supports both instance-based scaling and serverless workloads, giving customers the flexibility to scale their applications as needed:
Instance-Based Servers: For applications that require dedicated server instances, Kinesis offers scalable instance-based servers to meet performance needs.
Serverless Workloads: The true power of Kinesis lies in its ability to host serverless workloads. Serverless workloads are applications that can scale seamlessly across an arbitrary number of machines without requiring central coordination.
A great example of a serverless workload is REST APIs. When a customer makes a request, any instance in the cloud can handle it. If persistent changes are required, that instance can make modifications centrally (e.g., by writing to a database).
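Here is a minimal stateless handler in that spirit. The CENTRAL_DB dict stands in for a real shared database; in production, every replica would talk to the same central store:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

CENTRAL_DB = {"counter": 0}  # stand-in for a shared, central database

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The handler itself keeps no state, so any replica anywhere
        # in the cloud can serve this request interchangeably.
        CENTRAL_DB["counter"] += 1  # persistent changes go to the central store
        body = json.dumps({"hits": CENTRAL_DB["counter"]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), Handler).serve_forever()
```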
The three access modes compare as follows:

| | Serverless HTTPS | Serverless TCP | Instance-Based (Direct IP) |
| --- | --- | --- | --- |
| Typical workloads | Stateless request/response APIs (REST calls, blockchain RPC requests, LLM prompts) | Stateless workloads such as webservers with their storage elsewhere, or idempotent servers such as document servers, FTP servers, time servers, … | Stateful instances such as databases, storage, message queues, orchestrators, etc. |
| Number of nodes | Managed by portal (as influenced by customer settings) | Managed by portal (as influenced by customer settings) | Managed by customer |
| Endpoint type | Single HTTPS endpoint | Single Domain:Port | Single IP:Port |
| Endpoint example | https://<availability zone>.us-west-1.portal.kinesis.network/relay/<private accesskey> | <availability zone>.us-west-1.portal.kinesis.network:44112 | 34.56.78.90:43223 |
| Transport layer security | Always on; endpoint is HTTPS | Optional; can be reached over regular TCP or, optionally, over TLS for transport-layer security as an added benefit | Optional; can be reached over regular TCP or, optionally, over TLS for transport-layer security as an added benefit |
| Load balancing | Each new request hits the portal servers behind that URL, then is rerouted/load-balanced to one of the available containers, unless the customer prefers sticky sessions via a cookie | Each new TCP connection (regardless of the higher-level protocol; it could be FTP or HTTP) hits the portal servers behind that URL | N/A; no load balancing. The customer hits their container directly. Suitable only for DC nodes. More performant than HTTPS serverless mode because there is no Portal in the pipeline, but also less redundant because there is no load balancer |
| Scale up (provisioning the container with more CPU or RAM) | Automatic | Automatic | Automatic |
| Scale out (provisioning more containers on different servers) | Automatic | Automatic | Manual; the customer must add new servers themselves |
| Customer experience after scale out | Nothing changes in terms of access points; the customer still calls the original endpoint, which automatically makes the best use of the backing containers | Nothing changes in terms of access points; the customer still calls the original endpoint, which automatically makes the best use of the backing containers | The customer gets a new endpoint and must call it themselves (Akash supports only this mode) |
| Support for Global Nodes | Yes, supports both DC and Global Nodes | Yes, supports both DC and Global Nodes | No, DC nodes only. Global nodes could run these workloads, but customers wouldn't want to put stateful workloads on less reliable global nodes |
| Security | The private access key in the URL is cryptographically unguessable; as long as the customer keeps it private, no one else can use the endpoint | Domain names and IP addresses are public, so attackers can connect to these endpoints; containers must provide their own authentication (similar to DB servers, file servers, printers, etc.) | Domain names and IP addresses are public, so attackers can connect to these endpoints; containers must provide their own authentication (similar to DB servers, file servers, printers, etc.) |
| Access | Firewall to allow/block source IPs | Firewall to allow/block source IPs | Firewall to allow/block source and destination IPs |
| Log collection | Logs are collected from all nodes (typically at least two); customers can filter by date, and logs show what each server has sent | Logs are collected from all nodes (typically at least two); customers can filter by date, and logs show what each server has sent | Logs are collected from all nodes (typically only one); customers can filter by date, and logs show what the server has sent |
Discussion:
For TCP connections, the endpoints look something like platform1.us-east.kinesis.network:41883. Isn't this a blatant security/privacy issue? Don't we need to generate subdomains with enough characters to get some sort of security via obscurity, like Akash does with connection strings such as 8vg98dm2mhahjfmsr063a4d5t8.ingress.d3akash.cloud?
TCP is a basic Layer 4 transport protocol and hence has no inherent access controls. This is why TCP applications bring their own access control (e.g., internet standards like Telnet, SMTP, FTP, SSH, and HTTP, and proprietary protocols such as RDP, SMB, MongoDB, etc.). Anyone can connect to the TCP socket (i.e., complete the SYN-ACK handshake), but the server will demand proper authentication in that protocol or terminate the connection. That is simply how TCP applications work.
What security tools can we offer to our customers?
We will support firewalls, so whether or not requests are authenticated, customers can reduce the attack surface dramatically. Authenticating requests at the application layer still requires code (and any code can be vulnerable), and malicious connections can consume server resources like memory (especially kernel memory). A firewall can shrink the attack surface from worldwide adversaries to a much smaller group.
We will support TLS connections (which operate between Layers 4 and 7, hence bringing new features). TLS is a widely supported industry standard that provides two benefits: 1) It allows server and client authentication via certificates, so a server wouldn't even run any of the app code for an unknown client. This is safer because it adds another layer of protection, and these common code paths in the OS and standard libraries are typically more hardened than individual apps. 2) It provides transport-layer security: privacy and protection against tampering, replay attacks, and more.
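Benefit (1) can be sketched with Python's standard ssl module: a server that demands a client certificate rejects unknown peers during the handshake, before any application code runs. The certificate file names are placeholders:

```python
import socket
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain(certfile="server.crt", keyfile="server.key")
ctx.verify_mode = ssl.CERT_REQUIRED          # demand a client certificate
ctx.load_verify_locations(cafile="clients-ca.crt")  # CA for known clients

with socket.create_server(("", 8443)) as sock:
    with ctx.wrap_socket(sock, server_side=True) as tls:
        # The handshake fails for clients without a valid certificate,
        # so app code below only ever sees authenticated peers.
        conn, addr = tls.accept()
        print("authenticated connection from", addr)
```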
Can cryptographically unguessable domain names be a form of security?
Unfortunately no, because DNS records are public and can be discovered by commonly available tools. Even if they weren’t, the backing IP addresses are known, and can be crawled and scanned.
Can we do more?
We could introduce a client side SDK that takes over portions of authentication and transport security. I am hesitant because managing a multi-platform SDK is not easy.
The industry harvests idle compute for AI startups that need it
The biggest advantage of GPUaaS is economical. By removing the need to purchase and maintain the physical infrastructure, it allows companies to avoid investing in servers and IT management, and to instead put their resources toward improving their own deep learning, large language, and large vision models. It also lets customers pay for the exact amount of GPUs they use, saving the costs of the inevitable idle compute that would come with their own servers.
“Industry leaders are deeply committed to sustainability,” Khimani says. “With the focus on innovation and efficiency, they can optimize existing computing power that is already active and consuming energy, rather than continually adding more servers for every new application they run.”
The surge of interest in AI is creating a massive demand for computing power. Around the world, companies are trying to keep up with the vast amount of compute needed to power more and more advanced AI models. While GPUs are not the only option for AI workloads, they have become the hardware of choice due to their ability to efficiently handle multiple operations simultaneously—a critical feature when developing models.
But not every AI startup has the capital to invest in the huge numbers of GPUs now required to run a cutting-edge model. For some, it’s a better deal to outsource it. This has led to the rise of a new business: GPU-as-a-Service (GPUaaS). In recent years, a number of companies have sprouted up to remotely offer their clients the needed processing power.
While the tech giants offering such services own their infrastructure, smaller startups like Kinesis have created techniques to make the best out of existing idle compute.
“Businesses need compute. They need the model to be trained or their applications to be run; they don’t necessarily need to own or manage servers,” says Bina Khimani, co-founder of Kinesis.
It is estimated that more than half of existing GPUs are not in use at any given time. Whether we’re talking about personal computers or colossal server farms, a lot of processing capacity is under-utilized. What Kinesis does is identify idle compute—both GPUs and CPUs—in servers worldwide and compile it into a single computing source for companies to use. Kinesis partners with universities, data centers, companies, and individuals who are willing to sell their unused computing power. Through special software installed on their servers, Kinesis detects idle processing units, preps them, and offers them to its clients for temporary use. “At Kinesis, we have developed technology to pool together fragmented, idle compute power and repurpose it into a server-less, auto-managed computing platform,” says Khimani. Kinesis customers can even choose where they want their GPUs or CPUs to come from.
AI Is Growing Faster Than Servers Can Keep Up
GPUaaS is filling a growing gap in the AI industry. As learning models get more sophisticated, they need more power and an infrastructure that can process information faster and faster. In other words, without a sufficient number of GPUs, big AI models cannot operate—let alone improve. In October, OpenAI’s CEO, Sam Altman, said that the company was not releasing products as often as it wished because it was facing “a lot of limitations” with its computing capacity.
Also in October, Microsoft’s CFO, Amy Hood, told the company’s investors on a conference call that demand for AI “continues to be higher” than the company’s “available capacity.”
Server-less startups like Kinesis also claim to be friendlier to the environment than traditional cloud computing companies. By leveraging existing, unused processing units instead of powering additional servers, they say they significantly reduce energy consumption. In the last five years, big tech companies like Google and Microsoft have seen their carbon emissions climb due to the amount of energy consumed by AI. In response, some have turned their eyes to nuclear energy to sustainably power their servers. Kinesis and other new startups offer a third route in which no further servers need to be plugged in.
The growing demand for AI and colossal data consumption are turning GPUaaS into a very profitable tech sector. In 2023, the industry’s market size was estimated at US $3.23 billion; in 2024, it grew to $4.31 billion. It’s expected to rise to $49.84 billion by 2032. “The AI industry is rapidly advancing to a stage where the focus is shifting from merely building and training models to optimizing efficiency,” Khimani says. “Customers are increasingly asking questions like, ‘When training a new model, how can we do it extremely targeted and not consume an ocean of data that requires an enormous amount of compute and energy?’”
DeepSeek’s latest AI breakthrough has stirred up conversations across the industry. Some call it a game-changer; others compare it to AI’s "Sputnik moment." The reaction has been swift—not just in AI circles but across the financial markets, with over $600 billion wiped out from major tech stocks, including Nvidia, Microsoft, Alphabet, and Amazon.
Nvidia, in particular, suffered the biggest single-day loss in stock market history, as investors reevaluated expectations about AI compute demand. The market panic stemmed from claims that DeepSeek’s model delivers comparable performance at a fraction of the compute cost, disrupting assumptions about the infrastructure requirements of cutting-edge AI.
But beyond the headlines and market volatility, what does this actually mean for AI infrastructure and compute efficiency?
Let’s break down what’s happening and what it means for the future of AI compute.
DeepSeek trained its 671B parameter model using 2,048 GPUs over 57 days, totaling ~2.78 million GPU hours. That’s an efficiency win compared to industry norms, but it doesn’t fundamentally change the compute landscape.
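Those figures are easy to sanity-check: 2,048 GPUs running around the clock for 57 days gives an upper bound just above the reported total.

```python
gpus, days = 2048, 57
gpu_hours = gpus * days * 24
print(f"{gpu_hours:,} GPU-hours")  # 2,801,664; the reported ~2.78M fits within this window
```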
Key takeaways:
DeepSeek still required massive computational power. Their process optimized GPU utilization, but did not eliminate the need for high-performance hardware.
This is an optimization, not a paradigm shift. AI training remains an expensive, compute-heavy process, even with efficiency improvements.
DeepSeek leveraged Nvidia GPUs—highlighting that current AI breakthroughs are still tied to the same core hardware ecosystem.
Following DeepSeek’s announcement, the stock market saw a sharp reaction, with over $600 billion wiped out from major tech stocks including Nvidia, Microsoft, Alphabet, and Amazon. Investors scrambled to reassess expectations about AI infrastructure needs and whether more efficient models could disrupt existing business models. However, history tells us that market shocks often overcorrect, and long-term compute demand remains resilient. AI adoption continues to expand, and while efficiency gains shift expectations, the need for scalable, cost-effective compute infrastructure is only growing.
More AI models, smarter architectures, and cost-efficient optimizations = Higher compute demand, not less.
Nvidia will likely shift focus to inference acceleration.
Decentralized compute solutions will become increasingly relevant.
This is not a crisis for the compute industry—it’s an evolution.
While training is resource-intensive, inference is the real long-term bottleneck.
Most AI companies do not train foundational models; they fine-tune existing ones.
Training is typically a one-time or infrequent cost, whereas inference scales linearly with usage.
Compute demand will continue to rise as AI adoption expands into real-world applications requiring continuous inference.
DeepSeek does not change this equation—it simply highlights that optimization at every stage of AI compute is crucial.
One of the most overlooked impacts of AI efficiency improvements is the role of decentralized compute networks.
More efficient models fit better on smaller, distributed infrastructure.
Gamer PCs, idle enterprise servers, and decentralized nodes can now play a bigger role in AI processing.
Hyperscalers will remain dominant, but AI is no longer exclusive to massive centralized data centers.
This shift is great news for the entire AI ecosystem, including the open-source AI movement. With DeepSeek making its model freely available (unlike GPT-4o), new questions arise about the role of proprietary vs. open AI. Open-source AI could further fuel decentralized compute adoption as companies look for cost-effective, flexible infrastructure alternatives to closed AI models: startups building AI-driven products, enterprises integrating AI into their workflows, and decentralized compute providers enabling more efficient infrastructure all stand to gain. By reducing reliance on hyperscalers and enabling more flexible compute access, this trend will lower costs and expand opportunities for AI adoption across industries.
DeepSeek proves one thing: the AI race is heating up, and AI is now a geopolitical battleground. The U.S. has imposed export controls on advanced AI chips to China, while China continues to make strides in AI research and alternative chip development. This competition will not only accelerate AI investments but also shape enterprise adoption strategies worldwide.
China’s AI progress will likely accelerate global AI investments.
The focus is shifting from raw compute power to cost-efficient, scalable AI infrastructure.
As AI compute becomes more efficient and widely accessible, innovation will accelerate. Startups will have lower barriers to entry, enterprises will be able to experiment with AI integrations more affordably, and decentralized compute networks will expand their role in supporting AI workloads. This democratization of AI infrastructure will lead to new breakthroughs in model development, fine-tuning, and application deployment.
The AI industry is moving toward a new phase where efficiency, scalability, and decentralization define success. But there’s another important factor—AI compute pricing and sustainability. If DeepSeek’s efficiency claims hold, will AI compute pricing come under pressure? Hyperscalers may respond by adjusting their pricing models, while decentralized compute providers could offer cost-competitive alternatives. Meanwhile, as AI models scale, energy consumption concerns grow—creating an opportunity for more sustainable, decentralized AI compute solutions. Key trends to watch:
LLMs are becoming commodities—data ownership and enterprise adoption will be the real differentiators.
Nvidia and other hardware players will double down on inference-focused chips.
Decentralized compute networks will continue gaining traction as AI models become more efficient.
DeepSeek’s efficiency improvements are valuable, but they do not eliminate the need for massive compute power. Instead, they mark the beginning of a larger shift—one where geopolitics, open-source AI, pricing, and sustainability will shape the future of AI compute. They reinforce the importance of optimization, inference efficiency, and scalable infrastructure—areas where decentralized compute can shine.
At Kinesis Network, we’re building the future of AI compute—one that is scalable, cost-effective, and decentralized. As AI models evolve, so must the infrastructure that powers them. The future of AI isn’t just about who builds the biggest model—it’s about who can run them the smartest.
Want to stay ahead of the AI compute revolution? Follow us for more insights and updates on the future of compute.
In recent years, serverless computing has emerged as a transformative approach to running applications and workloads in the cloud. This model shifts responsibility for infrastructure provisioning and management from the end user to the cloud provider. In this post, we explore how Kinesis Network's serverless architecture saves costs.
1. Over-Provisioning and Idle Capacity
One of the most notable causes of waste in an instance-based computing model is over-provisioning. When organizations provision specific virtual machines or containers, they frequently allocate more resources—CPU, memory, or storage—than the application actively needs. This over-allocation often aims to ensure enough capacity is available for peak usage, even if those peaks only occur sporadically. Unfortunately, during low-traffic intervals, these resources remain largely underutilized, creating wasted capacity. Our experience has shown us that a large portion of instances see less than 20% utilization on average, sometimes as bad as only 1%. In contrast, a serverless environment scales automatically based on demand. The Kinesis Network spins up or tears down resources as needed, eliminating over-provisioning and shrinking the idle footprint significantly.
2. Pay-For-Idle vs. Pay-Per-Use
Hand-in-hand with over-provisioning is the pay-for-idle model inherent in instance-based services. Organizations pay for entire virtual machines regardless of whether they are actively processing tasks or sitting idle. This can be a significant cost drain for applications with unpredictable traffic or infrequent usage patterns. By contrast, Kinesis services follow a pay-per-use billing model. This means that costs accrue only when the application processes requests. With no fixed cost for idle time, developers can benefit from substantial cost savings, making serverless the more financially efficient option.
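A worked example with made-up prices shows the gap. At 15% average utilization, a reserved VM costs roughly 6.7x what pay-per-use billing would:

```python
instance_hourly = 1.00        # hypothetical $/hour for a comparable VM
hours_per_month = 730
utilization = 0.15            # fraction of the month the app is actually busy

pay_for_idle = instance_hourly * hours_per_month                 # billed 24/7
pay_per_use = instance_hourly * hours_per_month * utilization    # billed when active

print(f"reserved VM:  ${pay_for_idle:.2f}/month")   # $730.00
print(f"pay-per-use:  ${pay_per_use:.2f}/month")    # $109.50
```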
3. Operational Complexity
In an instance-based model, DevOps teams are responsible for provisioning, configuring, patching, and maintaining virtual machines. This operational complexity often leads to less predictable results and potential inefficiencies. Each instance must be monitored, and scaling must be managed carefully. This level of manual oversight not only consumes time and resources but also increases the likelihood of human error. In our serverless architecture, Kinesis Network handles most of these operational responsibilities—provisioning, scaling, fault tolerance, and more. The result is less overhead in terms of both personnel and budget, leading to a more focused environment where developers can prioritize core business logic.
4. Environmental Impact
The wasteful nature of running underutilized or idle virtual machines has broader implications beyond cost. Modern data centers require substantial energy to power servers and maintain cooling systems. When capacity is over-allocated, these underutilized servers continue to consume power, leaving a significant carbon footprint. Kinesis Network, with its on-demand approach, reduces the total operating hours of hardware. Because Kinesis Network allocates resources dynamically, computing resources remain dormant until needed, cutting down on energy usage and the resulting environmental impact.
5. Flexibility and Agility
From a software development perspective, serverless computing enables agile development. Functions can be deployed quickly without detailed infrastructure management, allowing teams to iterate faster. In an instance-based setup, teams must often navigate lengthy processes to provision and configure additional instances or adjust the size of existing ones. This overhead can stifle innovation and extend deployment cycles. In contrast, Kinesis Network streamlines these processes, ensuring teams can rapidly experiment, test, and roll out new features with fewer constraints—and less waste.
Despite these advantages, it's important to note that serverless computing isn't a silver bullet for all workloads. Long-running processes or applications with consistent, predictable loads might still benefit from instance-based deployments. However, for the vast majority of modern applications with variable workloads, serverless architectures offer a more environmentally sustainable approach to cloud computing.
SEATTLE, WA, UNITED STATES, January 16, 2025
Amid headlines about corporations investing in nuclear power plants to meet the insatiable energy needs of AI workloads, two pioneering companies, Multiverse Computing and Kinesis Network (“Kinesis”), have joined forces to provide a groundbreaking alternative—optimizing AI performance while drastically reducing resource consumption. The partnership comes as demand for AI capabilities grows exponentially, and organizations worldwide face mounting challenges related to power demands for these models.
Tackling Two Sides of the Same Problem
Multiverse Computing, a global leader in quantum and high-performance computing solutions, has been at the forefront of optimizing large language models (LLMs) to deliver improved performance and efficiency. By using advanced quantum-inspired algorithms, Multiverse Computing enhances the training and inference processes of LLMs, reducing computation time and energy requirements without compromising accuracy. The company’s compression software CompactifAI uses quantum-inspired tensor networks to improve the efficiency and performance of AI models. Shrinking these models reduces the compute power required to run them, which in turn reduces operating costs.
“We have found the perfect partner for our first significant U.S. partnership,” said Enrique Lizaso Olmos, CEO and co-founder of Multiverse Computing. “This collaboration will power new use cases for AI companies with models currently limited by power and compute requirements.”
Kinesis specializes in compute optimization, pooling distributed unused computing resources to maximize efficiency and minimize waste. Kinesis’ platform intelligently allocates workloads, ensuring that computational resources are used to their fullest potential. This dynamic resource optimization reduces infrastructure costs and aligns with global sustainability goals by decreasing carbon footprints.
Kinesis’s innovative compute platform has also captured the attention of leading-edge organizations such as nCorium, Rare Compute, and Liminal Capital. By leveraging Kinesis’s optimization technology, these forward-thinking organizations aim to advance their sustainability journey while enhancing computational efficiency and reducing costs—without compromising performance. This growing interest from diverse industries highlights the versatility and transformative potential of Kinesis's approach to compute access and optimization.
A Collaborative Solution for Joint Customers
Together, Multiverse Computing and Kinesis are solving a pressing industry challenge: harnessing the power of AI while mitigating the environmental and financial costs associated with significant computational demands. By integrating their respective expertise in LLM and compute optimization, the two companies offer a comprehensive solution for customers seeking to drive innovation efficiently and sustainably.
“AI workloads are pushing the limits of what current infrastructure can handle,” said Victor Gaspar, Chief Sales Officer of Multiverse Computing. “Our partnership with Kinesis allows us to offer a comprehensive approach to optimization, addressing both the performance of AI models and the utilization of compute resources. This collaboration ensures our customers can innovate without compromising on cost or sustainability.”
Multiverse Computing recently completed the AWS Gen AI Accelerator program, which included AWS credits, mentorship, and resources to expand AI and machine learning (ML) technologies. Kinesis is an official AWS partner.
“The synergy between Kinesis and Multiverse Computing is a game-changer for the AI industry,” said Bina Khimani, Chief Product and Revenue Officer of Kinesis. “By tackling inefficiencies in both model performance and computational resource allocation, we’re enabling enterprises to achieve more with less. It’s a powerful step forward for both business and the planet.”
Driving Efficiency and Sustainability
This partnership comes at a critical time when industries face increasing scrutiny over the environmental impact of AI technologies. By reducing energy consumption and optimizing resource usage, Multiverse Computing and Kinesis are driving operational efficiencies and setting a benchmark for sustainable AI practices. Together, Multiverse Computing and Kinesis are charting a course toward a more efficient and sustainable AI future, proving that innovation doesn’t have to come at the cost of the planet.
About Kinesis Network
Kinesis Network is a revolutionary platform that solves the critical shortage of computing access for AI developers, researchers, and enterprises. By connecting GPU providers with those who need computing power, Kinesis makes high-performance computing more efficient and accessible. Trusted by enterprises, academic institutions, and research organizations, Kinesis democratizes access to the vital infrastructure needed for AI development and data-intensive research. For more information about how Kinesis Network solves the compute access shortage, visit .
About Multiverse Computing
Founded in 2019, Multiverse Computing is a leading quantum software company that combines quantum computing, AI and optimization solutions to solve complex industry challenges. The company’s team of over 160 full-time employees, comprising 40% PhDs and representing more than 43 nationalities, has developed CompactifAI, an LLM compressor which uses quantum-inspired tensor networks to make large language models (LLMs) more efficient and portable, reducing size by over 90%, with only a 2–3% drop in accuracy, and with over 50% savings in retraining and inference costs. Multiverse Computing enables organizations to optimize their AI models and computing resources for enhanced performance and efficiency. For more information, visit .
Kinesis Network is designed to be flexible and multi-purpose. However, some workloads stand out as benefiting most from its capabilities: applications that rely on parallel computation (splitting large tasks into smaller ones) and on optimized hardware acceleration.
Below are some common examples of workloads or applications that tend to push CPUs and GPUs to their limits:
AI Model Training
What It Is: Training large-scale models (e.g., convolutional neural networks for image recognition, transformers for language tasks) involves iterating over massive datasets and performing billions of floating-point operations.
Why It’s Intensive: Each forward and backward pass can update millions or even billions of parameters. GPUs excel at this kind of parallelizable matrix multiplication.
Examples:
Image Classification (ResNet, EfficientNet) on datasets like ImageNet.
Large Language Models (GPT-style), which can have hundreds of billions of parameters.
Recommendation Systems at companies like Netflix or YouTube, which process user activity logs in real time to update models.
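To make the pattern concrete, here is a minimal sketch of such a training loop. It assumes PyTorch and uses synthetic data and a toy model purely for illustration, not any production configuration:

```python
# Minimal supervised training-loop sketch, assuming PyTorch is installed;
# the model, batch shapes, and hyperparameters are illustrative only.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(1000):
    # Synthetic batch: in practice this comes from a DataLoader.
    x = torch.randn(64, 128, device=device)
    y = torch.randint(0, 10, (64,), device=device)

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)   # forward pass
    loss.backward()               # backward pass: gradients for every parameter
    optimizer.step()              # parameter update
```

Every iteration of this loop is dominated by the matrix multiplications in the forward and backward passes, which is exactly the work GPUs parallelize.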
AI Inference at Scale
What It Is: Once models are trained, they need to make predictions on new data quickly.
Why It’s Intensive: High-traffic systems (like voice assistants, search engines, or real-time translation services) can receive millions of queries per second. Optimizing inference—often on GPUs or specialized hardware (FPGAs, TPUs)—is critical for low latency.
Examples:
Virtual Assistants (alternatives to Siri, Alexa, or Google Assistant).
Security Systems (real-time analysis of security footage, pattern matching).
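A correspondingly minimal serving sketch, again assuming PyTorch with an illustrative stand-in model, shows the low-latency pattern of batched, gradient-free inference:

```python
# Minimal low-latency inference sketch, assuming PyTorch; the model and
# query tensors are illustrative stand-ins for a deployed network.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
model.eval()                                      # freeze dropout/batch-norm behavior

queries = torch.randn(512, 128, device=device)    # stand-in for queued requests

with torch.no_grad():                             # skip gradient bookkeeping entirely
    for batch in queries.split(64):               # fixed-size batches amortize overhead
        predictions = model(batch).argmax(dim=1)
```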
Reinforcement Learning (RL)
What It Is: Training agents to interact with environments (e.g., game playing, industrial robots).
Why It’s Intensive: Simulation-based training (like AlphaGo/AlphaZero) can require playing millions of matches or environment steps.
Examples:
Game-playing agents (e.g., Go or chess engines) training through self-play.
Industrial Robotics where digital twins simulate thousands of robotic arm movements to optimize tasks.
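As a toy illustration of the RL training loop, here is tabular Q-learning on a made-up one-dimensional corridor environment. It is a sketch only; real workloads replace the corridor with heavy simulators and the table with deep networks:

```python
# Toy tabular Q-learning on a hypothetical 1-D corridor: the agent starts
# at cell 0 and is rewarded only for reaching the rightmost cell.
import numpy as np

n_states, n_actions = 10, 2             # actions: 0 = left, 1 = right
q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.2  # learning rate, discount, exploration
rng = np.random.default_rng(0)

for episode in range(300):
    s = 0
    for _ in range(200):                # cap episode length
        if rng.random() < epsilon:      # explore
            a = int(rng.integers(n_actions))
        else:                           # exploit, breaking ties randomly
            a = int(rng.choice(np.flatnonzero(q[s] == q[s].max())))
        s_next = max(s - 1, 0) if a == 0 else s + 1
        reward = 1.0 if s_next == n_states - 1 else 0.0
        # Bellman update toward the bootstrapped target
        q[s, a] += alpha * (reward + gamma * q[s_next].max() - q[s, a])
        s = s_next
        if s == n_states - 1:
            break
```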
Climate and Weather Modeling
What It Is: Simulations of the Earth’s atmosphere, oceans, and land processes.
Why It’s Intensive: These models involve solving partial differential equations across 3D grids that can contain billions of cells. Tiny time steps are used for accuracy, leading to large computational workloads.
Examples:
Weather forecasting research, hurricane simulations, early warning systems.
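The computational core is time-stepping a discretized PDE. Here is a minimal sketch, assuming NumPy, with a toy 2-D heat-diffusion problem standing in for a full atmosphere model:

```python
# Minimal finite-difference sketch of PDE time-stepping on a grid:
# 2-D heat diffusion, a tiny stand-in for the billions-of-cells case.
import numpy as np

n, dx, dt, alpha = 100, 1.0, 0.2, 1.0   # grid size, spacing, time step, diffusivity
u = np.zeros((n, n))
u[n // 2, n // 2] = 100.0               # hot spot in the center

for step in range(1000):                # tiny explicit steps, needed for stability
    # Discrete Laplacian; np.roll wraps around (periodic boundaries).
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
           np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u) / dx**2
    u = u + alpha * dt * lap            # forward-Euler update
```

Production climate models solve far richer equations on 3-D grids, but the cost structure is the same: cells × time steps × operations per cell.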
Computational Fluid Dynamics (CFD)
What It Is: Numerical methods for analyzing fluid flows—key in engineering (aerospace, automotive).
Why It’s Intensive: Accurate CFD often requires extremely fine meshes or grids to capture turbulence and boundary layers, resulting in massive numerical calculations.
Examples:
Aircraft Design.
Vehicle Aerodynamics (simulating airflow around cars for optimizing fuel economy).
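For a flavor of the numerics, here is a minimal sketch of a first-order upwind scheme for 1-D linear advection, one of the simplest building blocks of CFD solvers; all values are illustrative:

```python
# First-order upwind scheme for 1-D linear advection: a pulse is carried
# to the right at speed c. Real CFD solves far richer equations on 3-D meshes.
import numpy as np

n, dt, c = 400, 0.001, 1.0
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]                       # uniform cell spacing
u = np.exp(-200 * (x - 0.25) ** 2)     # initial pulse

for step in range(500):
    # Upwind difference (c > 0): information flows left to right;
    # u[0] acts as a fixed inflow boundary.
    u[1:] -= c * dt / dx * (u[1:] - u[:-1])
```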
Astrophysics and Cosmology
What It Is: Simulating large-scale structures in the universe (e.g., galaxy formation), star evolution, black holes, and gravitational waves.
Why It’s Intensive: Interactions among billions of particles or elements, along with complex physical laws (general relativity, plasma physics), make these simulations extremely heavy.
Examples:
Simulations of Galaxy Clusters by universities and research institutes.
Studying Black Hole Mergers.
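A rough sketch of the underlying kernel: direct-summation gravitational N-body integration in NumPy, whose all-pairs computation is exactly why cost grows as O(N²) with particle count:

```python
# Direct-summation N-body sketch: every pairwise gravitational interaction
# is computed each step. Units, counts, and softening are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, G, dt, soft = 500, 1.0, 0.01, 0.05        # bodies, units, step, softening
pos = rng.normal(size=(n, 3))
vel = np.zeros((n, 3))
mass = np.full(n, 1.0 / n)

for step in range(100):
    diff = pos[None, :, :] - pos[:, None, :]              # pairwise separations
    dist3 = (np.sum(diff**2, axis=-1) + soft**2) ** 1.5   # softened |r|^3
    acc = G * np.sum(mass[None, :, None] * diff / dist3[:, :, None], axis=1)
    vel += acc * dt                                       # leapfrog-style update
    pos += vel * dt
```

Production codes replace the all-pairs sum with tree or mesh methods, but even those remain among the heaviest workloads in science.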
Particle and Nuclear Physics
What It Is: Modeling subatomic particle interactions, nuclear reactor cores, or accelerator experiments.
Why It’s Intensive: Requires quantum-level physics, Monte Carlo methods, and/or large-scale iterative solvers.
Examples:
Simulating particle collisions.
Fusion reactor simulations.
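A minimal Monte Carlo sketch in this spirit estimates transmission through a slab by sampling random free paths; the coefficients are illustrative, not physical data:

```python
# Monte Carlo sketch in the spirit of particle-transport codes: estimate
# the fraction of particles crossing a slab by sampling random free paths.
import numpy as np

rng = np.random.default_rng(2)
mu, thickness, n = 1.5, 2.0, 1_000_000   # interactions per unit length, depth, samples

free_paths = rng.exponential(1.0 / mu, size=n)   # distance to first interaction
transmitted = np.mean(free_paths > thickness)

print(transmitted, np.exp(-mu * thickness))      # estimate vs. analytic exp(-mu*d)
```

Real transport codes track scattering, absorption, and energy per particle, multiplying this cost by orders of magnitude.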
3D Rendering and Visual Effects
What It Is: Creating photorealistic images and animations.
Why It’s Intensive: Global illumination, ray tracing, and advanced physics-based lighting models require evaluating complex mathematical functions for each pixel. Frames can take hours each to render at high quality.
Examples:
Production of feature films.
Blender’s Cycles (open-source) for CPU/GPU path tracing.
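As a miniature of the per-pixel work involved, here is a toy NumPy ray tracer: one sphere, one light, Lambertian shading. It is a sketch, not a production renderer:

```python
# Toy ray tracer: intersect camera rays with one sphere and shade with a
# single Lambert term, computed for every pixel. Scene values are made up.
import numpy as np

h, w = 256, 256
ys, xs = np.mgrid[0:h, 0:w]
# Camera rays from the origin through an image plane at z = 1.
d = np.stack([(xs - w / 2) / w, (ys - h / 2) / h, np.ones((h, w))], axis=-1)
d /= np.linalg.norm(d, axis=-1, keepdims=True)

center, radius = np.array([0.0, 0.0, 3.0]), 1.0
light = np.array([0.577, -0.577, -0.577])        # unit vector toward the light

# Ray-sphere intersection: solve |t*d - center|^2 = radius^2 for t.
b = np.sum(d * center, axis=-1)
disc = b**2 - (center @ center - radius**2)
hit = disc > 0
t = b - np.sqrt(np.where(hit, disc, 0.0))        # nearest intersection distance

normal = t[..., None] * d - center               # surface normal at the hit point
normal /= np.linalg.norm(normal, axis=-1, keepdims=True)
shade = np.clip(np.sum(normal * light, axis=-1), 0, 1)  # Lambert term
image = np.where(hit & (t > 0), shade, 0.0)      # grayscale image array
```

Path tracers like Cycles repeat work of this shape many times per pixel, per bounce, per frame, which is where the hours-per-frame figure comes from.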
Protein Folding
What It Is: Determining how a protein’s amino acid chain folds into a 3D structure—crucial for understanding biological functions and designing drugs.
Why It’s Intensive: The potential configuration space is astronomically large. Advanced methods (like AlphaFold) use deep learning models that require significant GPU resources.
Examples:
Predicting structures for nearly all known proteins.
Volunteer computing for protein folding research.
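To see why the configuration space explodes, consider the classic 2-D HP lattice toy model: the number of self-avoiding conformations grows roughly as 3^(n−1) with chain length n, so even brute force on a short chain is instructive. A small sketch, with a made-up sequence:

```python
# Brute-force search in the toy 2-D HP lattice model: enumerate every
# self-avoiding conformation of a short chain and score H-H contacts.
seq = "HPHPPHHPH"                        # H = hydrophobic, P = polar (toy sequence)
moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def best_energy(path):
    occupied = {p: i for i, p in enumerate(path)}
    if len(path) == len(seq):
        # Energy: -1 for each adjacent, non-consecutive H-H pair on the lattice.
        e = 0
        for (x, y), i in occupied.items():
            for dx, dy in ((1, 0), (0, 1)):      # count each lattice pair once
                j = occupied.get((x + dx, y + dy))
                if j is not None and abs(i - j) > 1 and seq[i] == seq[j] == "H":
                    e -= 1
        return e
    x, y = path[-1]
    branches = [best_energy(path + [(x + dx, y + dy)])
                for dx, dy in moves if (x + dx, y + dy) not in occupied]
    return min(branches) if branches else float("inf")   # inf = dead-end walk

print(best_energy([(0, 0)]))             # most negative reachable energy
```

Methods like AlphaFold sidestep this enumeration entirely, but at the price of very large GPU training and inference workloads.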
Genome Sequencing and Assembly
What It Is: Processing raw sequencing reads to reconstruct whole genomes (e.g., human, plant, bacterial).
Why It’s Intensive: Datasets can easily reach terabytes in size. Tasks like de novo assembly or alignment (e.g., with Bowtie or BLAST) require large HPC clusters.
Examples:
Large-scale sequencing projects (e.g., 1000 Genomes, Cancer Genomics).
Metagenomic Studies analyzing entire microbial communities.
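A minimal sketch of one core primitive, k-mer counting, using made-up reads; real pipelines run this same window-slide over terabytes of data:

```python
# Minimal k-mer counting sketch — a core primitive behind many assembly
# and alignment pipelines. The reads here are invented for illustration.
from collections import Counter

reads = ["ACGTACGTGACG", "GTACGTGACGTT", "TTACGTACGTGA"]  # toy sequencing reads
k = 5

counts = Counter()
for read in reads:
    for i in range(len(read) - k + 1):
        counts[read[i:i + k]] += 1     # slide a k-length window over the read

print(counts.most_common(3))           # most frequent k-mers
```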
Molecular Dynamics (MD)
What It Is: Simulating the movement of atoms in molecules or complexes over time.
Why It’s Intensive: Calculating forces and interactions at each step for millions of atoms demands CPU/GPU acceleration (e.g., GROMACS, NAMD).
Examples:
Drug Discovery (predicting how small molecules bind to protein targets).
Basic Biophysics Research on membrane channels or virus capsids.
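A sketch of the MD inner loop follows: Lennard-Jones forces with velocity-Verlet integration in reduced units. Codes like GROMACS or NAMD do this for millions of atoms, with neighbor lists and far more physics:

```python
# MD inner-loop sketch: Lennard-Jones pair forces plus velocity-Verlet
# integration in reduced units; atom count and spacing are illustrative.
import numpy as np

rng = np.random.default_rng(3)
n, dt = 64, 0.002
grid = np.arange(4) * 1.5                              # spacing near the LJ minimum
pos = np.array(np.meshgrid(grid, grid, grid)).reshape(3, -1).T
vel = 0.1 * rng.normal(size=(n, 3))

def forces(pos):
    diff = pos[:, None, :] - pos[None, :, :]           # all pairwise separations
    r2 = np.sum(diff**2, axis=-1)
    np.fill_diagonal(r2, np.inf)                       # exclude self-interaction
    inv6 = r2**-3.0
    f_mag = 24.0 * (2.0 * inv6**2 - inv6) / r2         # from U = 4(r^-12 - r^-6)
    return np.sum(f_mag[..., None] * diff, axis=1)

f = forces(pos)
for step in range(100):                                # velocity-Verlet steps
    vel += 0.5 * dt * f
    pos += dt * vel
    f = forces(pos)
    vel += 0.5 * dt * f
```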
Big Data Processing
What It Is: Systems like Apache Hadoop and Spark split massive datasets across many nodes for parallel processing.
Why It’s Intensive: Operations like sorting, aggregating, or joining large tables can involve scanning petabytes of data.
Examples:
ETL Pipelines for enterprise data lakes.
Social Media Analytics at companies like Twitter or LinkedIn, which process billions of events daily.
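A hedged PySpark sketch of such an aggregation is below; the bucket path and column names (`user_id`, `ts`, `bytes`) are hypothetical, and the same job fans out across however many executors the cluster provides:

```python
# PySpark sketch of a large group-by aggregation; paths and column names
# are placeholders, not a real dataset.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("traffic-rollup").getOrCreate()

events = spark.read.parquet("s3://example-bucket/events/")   # hypothetical path
daily = (events
         .groupBy("user_id", F.to_date("ts").alias("day"))   # shuffle across nodes
         .agg(F.sum("bytes").alias("total_bytes"))
         .orderBy(F.desc("total_bytes")))

daily.write.mode("overwrite").parquet("s3://example-bucket/rollups/")
```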
Graph Analytics
What It Is: Analyzing node-link structures to find patterns (e.g., community detection, shortest paths, graph embeddings).
Why It’s Intensive: Graph algorithms can be complex (e.g., PageRank), with large real-world graphs (millions or billions of nodes/edges).
Examples:
Social Graph Analysis for friend recommendations.
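A minimal sketch of the kernel behind PageRank, power iteration on a tiny made-up adjacency matrix, shows the linear algebra that dominates cluster time at billions of edges:

```python
# PageRank via power iteration on a toy directed graph:
# A[i, j] = 1 means an edge from node i to node j.
import numpy as np

A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)

M = A / A.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
n, d = A.shape[0], 0.85               # node count, damping factor

rank = np.full(n, 1.0 / n)
for _ in range(100):                  # power iteration
    rank = (1 - d) / n + d * (M.T @ rank)

print(rank / rank.sum())
```

On real social graphs the same iteration runs as a distributed sparse matrix-vector product, repeated until convergence.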
Real-Time Stream Processing
What It Is: Handling data that arrives continuously (logs, sensor data, click streams) for immediate analytics and alerts.
Why It’s Intensive: Requires fast ingestion, transformations, and real-time dashboards. Latency constraints necessitate efficient CPU/GPU usage.
Examples:
Financial Tick Data in high-frequency trading.
IoT Sensor Streams in factories or connected devices.
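A minimal sketch of the sliding-window pattern behind such dashboards, with a simulated sensor stream standing in for a real ingest source:

```python
# Sliding-window aggregation over an unbounded event stream — the core
# pattern behind real-time dashboards. The stream here is simulated.
from collections import deque
import random

window = deque(maxlen=1000)            # keep only the most recent readings

def sensor_stream():                   # stand-in for a real ingest source
    while True:
        yield random.gauss(20.0, 2.0)  # e.g., a temperature reading

for i, reading in enumerate(sensor_stream()):
    window.append(reading)
    if reading > 26.0:                 # simple threshold alert
        print(f"alert: reading {reading:.1f} above threshold")
    if i >= 10_000:                    # demo cutoff; real streams never end
        break

print(f"rolling mean: {sum(window) / len(window):.2f}")
```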
Encryption and Decryption
What It Is: Securing large volumes of data in motion (TLS/SSL connections) and at rest (disk encryption).
Why It’s Intensive: Bulk cryptographic operations on huge data sets are costly, though modern CPU instruction sets (AES-NI) and hardware accelerators help.
Examples:
VPN Gateways handling encrypted connections for thousands of users.
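A short sketch of authenticated bulk encryption, assuming the Python `cryptography` package is installed; the payload is a dummy buffer:

```python
# Authenticated encryption with AES-GCM via the `cryptography` package.
# On CPUs with AES-NI, the bulk work runs in dedicated instructions.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)

nonce = os.urandom(12)                          # must be unique per message
plaintext = b"x" * 1_000_000                    # 1 MB of dummy data at rest
ciphertext = aesgcm.encrypt(nonce, plaintext, None)
assert aesgcm.decrypt(nonce, ciphertext, None) == plaintext
```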
Password Cracking and Recovery
What It Is: Testing password strength by trying many possibilities (brute force) or using dictionary-based attacks.
Why It’s Intensive: GPU-acceleration (e.g., via Hashcat) can test billions of hashes per second.
Examples:
Penetration Testing for corporate security.
Law Enforcement accessing encrypted devices with court authorization.
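A minimal dictionary-attack sketch with `hashlib` shows the concept; GPU tools like Hashcat do exactly this, only billions of times faster. The target hash and wordlist are made up:

```python
# Dictionary attack sketch: hash each candidate and compare to the target.
import hashlib

target = hashlib.sha256(b"sunshine42").hexdigest()          # hash under attack
wordlist = ["password", "letmein", "sunshine42", "qwerty"]  # toy dictionary

for word in wordlist:
    if hashlib.sha256(word.encode()).hexdigest() == target:
        print(f"match found: {word}")
        break
```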
Monte Carlo Simulations in Finance
What It Is: Statistical simulations for risk assessment, derivative pricing (e.g., options, bonds), and portfolio optimization.
Why It’s Intensive: Accurate results often require millions of iterations, each involving complex financial models.
Examples:
Derivative Pricing of complex instruments (e.g., exotic options).
Value at Risk (VaR) calculations across large portfolios in investment banks.
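A minimal sketch: Monte Carlo pricing of a European call under geometric Brownian motion, with illustrative market parameters:

```python
# Monte Carlo pricing of a European call under geometric Brownian motion;
# a simple instance of the millions-of-paths pattern described above.
import numpy as np

rng = np.random.default_rng(4)
s0, k, r, sigma, t = 100.0, 105.0, 0.03, 0.2, 1.0   # spot, strike, rate, vol, years
n_paths = 1_000_000

z = rng.standard_normal(n_paths)
s_t = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)
payoff = np.maximum(s_t - k, 0.0)                   # call payoff at expiry
price = np.exp(-r * t) * payoff.mean()              # discounted expectation

print(f"estimated call price: {price:.3f}")
```

Exotic instruments replace the one-line terminal price with full path simulations, which is where the iteration counts explode.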
High-Frequency Trading (HFT)
What It Is: Automated trading strategies that react to market changes in microseconds or nanoseconds.
Why It’s Intensive: The time factor is critical. Firms invest in specialized HPC clusters, FPGAs, or ASICs to reduce latency.
Examples:
Quant Funds (Renaissance Technologies, Two Sigma) using large computing clusters.
Market Making requiring real-time price updates and predictive models.
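A grossly simplified sketch of a signal kernel, a vectorized moving-average crossover over synthetic ticks, gives the flavor; real HFT systems implement logic like this in FPGAs or ASICs to shave microseconds:

```python
# Toy trading-signal kernel over synthetic random-walk ticks; this is a
# batch illustration, not a latency-competitive implementation.
import numpy as np

rng = np.random.default_rng(5)
prices = 100 + np.cumsum(rng.normal(0, 0.01, size=100_000))

def moving_average(x, w):
    return np.convolve(x, np.ones(w) / w, mode="valid")

fast, slow = moving_average(prices, 20), moving_average(prices, 100)
fast = fast[-len(slow):]                       # align the two series
signal = np.sign(fast - slow)                  # +1 = long, -1 = short
print(f"position changes: {int(np.abs(np.diff(signal)).sum() / 2)}")
```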
Finite Element Analysis (FEA)
What It Is: Breaking down complex structures into smaller elements to analyze stress, strain, heat transfer, etc.
Why It’s Intensive: Large models with fine meshes require iterative solvers and matrix operations. Parallel processing across CPU cores or GPUs is often used.
Examples:
Automotive Crash Simulations (ANSYS, LS-DYNA).
Aerospace Structural Analysis.
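A minimal FEA sketch: a 1-D elastic bar discretized into elements, with the global stiffness matrix assembled and solved for nodal displacements. Numbers are illustrative; real FEA uses millions of 3-D elements and iterative solvers:

```python
# 1-D bar finite element sketch: assemble element stiffness matrices into
# a global system K u = f and solve for the nodal displacements.
import numpy as np

n_el, length, ea = 10, 1.0, 1.0e6        # elements, bar length [m], EA [N]
le = length / n_el
k_el = ea / le * np.array([[1.0, -1.0], [-1.0, 1.0]])  # element stiffness

K = np.zeros((n_el + 1, n_el + 1))
for e in range(n_el):                    # assemble global stiffness matrix
    K[e:e + 2, e:e + 2] += k_el

f = np.zeros(n_el + 1)
f[-1] = 1000.0                           # 1 kN axial load at the free end

u = np.zeros(n_el + 1)                   # fix node 0, solve the reduced system
u[1:] = np.linalg.solve(K[1:, 1:], f[1:])
print(f"tip displacement: {u[-1]:.6f} m")  # analytic: F*L/(EA) = 0.001 m
```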
Generative Design
What It Is: Algorithms that iteratively suggest new designs based on performance goals (weight, strength, efficiency).
Why It’s Intensive: Each candidate design requires a full analysis (e.g., an FEA run), repeated dozens or hundreds of times.
Examples:
Lightweighting automotive parts for better fuel efficiency.
Architectural Design for optimized building layouts (Autodesk’s Generative Design).
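A sketch of one iteration of this analyze-and-refine pattern: numerically minimizing the mass of a bar cross-section subject to a stress limit, assuming SciPy is available; the material numbers are illustrative:

```python
# Constrained sizing sketch: minimize mass subject to a stress limit.
# Generative design repeats loops like this across many candidate shapes.
from scipy.optimize import minimize

rho, length = 2700.0, 1.0             # aluminum density [kg/m^3], bar length [m]
force, sigma_max = 50_000.0, 250e6    # axial load [N], allowable stress [Pa]

mass = lambda a: rho * length * a[0]  # objective: mass of a bar of area a[0]
stress_ok = {"type": "ineq",          # constraint: F / A <= sigma_max
             "fun": lambda a: sigma_max - force / a[0]}

res = minimize(mass, x0=[1e-3], bounds=[(1e-6, 1e-2)], constraints=[stress_ok])
print(f"optimal area: {res.x[0] * 1e6:.1f} mm^2")  # analytic: F/sigma_max = 200 mm^2
```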
Video Encoding and Transcoding
What It Is: Encoding large video files for streaming or storage using codecs like H.264, HEVC, VP9, AV1.
Why It’s Intensive: Each frame undergoes complex compression algorithms. More pixels (4K/8K) = more data. Real-time or batch encoding at scale can stress CPU/GPU clusters.
Examples:
Media encoding to optimize storage and delivery for various screen sizes.
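A transcoding sketch that shells out to ffmpeg (assumed to be on PATH; file names are placeholders):

```python
# Transcode a mezzanine file to H.264 with ffmpeg; CRF trades file size
# against quality. At streaming-platform scale, jobs like this run in
# parallel across large CPU/GPU fleets.
import subprocess

subprocess.run([
    "ffmpeg", "-i", "master.mov",     # hypothetical high-quality input
    "-c:v", "libx264",                # H.264 software encoder
    "-crf", "23",                     # constant-rate-factor quality target
    "-preset", "medium",              # speed/compression trade-off
    "-c:a", "aac",
    "output_1080p.mp4",
], check=True)
```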
Digital Twins
What It Is: Digital replicas of physical environments (e.g., a production line, traffic network) coupled with real-time sensor data.
Why It’s Intensive: Simulating thousands of moving parts, IoT data streams, and complex event processing requires HPC-level resources.
Examples:
Digital twins for factory floors, integrating robotics and sensor feedback.
Urban Planning tools simulating traffic flow, infrastructure loads, and environmental impact.
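A toy discrete-event sketch of a two-station production line, assuming the `simpy` package, gives a miniature of the event processing inside a factory twin:

```python
# Discrete-event sketch of a small production line with simpy; arrival
# and processing times are invented for illustration.
import random
import simpy

def part(env, name, station):
    with station.request() as req:               # wait for a free machine
        yield req
        yield env.timeout(random.uniform(1, 3))  # processing time, minutes
    print(f"{name} done at t={env.now:.1f}")

def arrivals(env, station):
    for i in range(10):
        env.process(part(env, f"part-{i}", station))
        yield env.timeout(random.expovariate(1.0))  # gaps between arrivals

env = simpy.Environment()
station = simpy.Resource(env, capacity=2)        # two parallel machines
env.process(arrivals(env, station))
env.run(until=60)
```

A real twin would drive events from live sensor feeds rather than random draws, multiplying both the event volume and the model complexity.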
Vehicle and Aircraft Digital Twins
What It Is: Real-time virtual counterparts of vehicles or aircraft to test system updates, maintenance, and design changes.
Why It’s Intensive: Must incorporate physics, AI-based control systems, and potentially large sensor datasets from the real-world counterpart.
Examples:
Motorsport teams (e.g., Formula 1) running in-race simulations to refine strategy.