# AIGr.id: An Open, Plural, and Polycentric AI Network
AIGrid represents a fundamental shift from siloed, monolithic AI to an open, plural, and networked AI ecosystem.
AIGr.id is a polycentric network of independent, plural AI components that coordinate to perform tasks, exchange data, and compose into higher-level intelligence, broader and greater than the sum of its parts.
Designed as a global public infrastructure for AI, AIGrid is not owned or controlled by any single entity. It is contributed to and accessed as a digital commons: intelligence built by many, for all.
Powered by OpenOS.AI (AIOS), a distributed AI operating system for open, plural, and polycentric AI networks.
OpenOS.AI is 100% open source and community-driven.
Product Deep Dive: Discover the philosophy, design, strategy, and purpose behind this platform in our in-depth write-up.
Read the full write-up
While a variant of AIGr.id has been running in production at nearly 500k inferences per second on bare-metal infrastructure in a federated setting for close to a year, supporting real-time, sustainable Vision AI workloads, the released version includes significant upgrades to support broader goals, including LLM integration. Although unit and integration tested, this version has not yet been validated at similar scale or duration. As such, the project remains in beta and is not recommended for production use at this time.
OpenOS.AI provides full-stack AI operations, a globally distributed and optimized AI compute platform, and data management for decentralized AI networks.
OpenOS.AI enables:
- Creation and coordination of multiple cognitive architectures
- Composition of modular, networked AI systems
- Dynamic orchestration of distributed AI agents and services
- Optimized, cloud-native, and sovereign AI computing at scale
- Actor-controlled resource allocation across shared infrastructure
- Shareable AI, compute, and data as a digital commons and a gig economy for AI
- Distributed data management and flow at massive scale
- Polycentric governance and programmable autonomy
- Open, multiplayer AI production and distribution at global scale
- System-wide observability, behavior tracing, and telemetry
## Core Features Overview
| Feature | Description |
|---|---|
| Unified Network-Level, Multi-Cluster Resource Pooling | Seamlessly connect Kubernetes clusters from different locations to form a unified resource pool for running any kind of computational workload. |
| Flexible Resource Allocation & Scheduling | Schedule AI models (LLMs), general compute logic, or custom Blocks on any cluster. Includes customizable scaling, load balancing, and health checks. |
| Policy-Driven Infrastructure and Job Management | Govern infrastructure and workloads using Python-based policies for full control over network, cluster, and job behavior. |
| Distributed Graph Execution (vDAGs) | Define complex workflows as DAGs of Blocks, enabling distributed execution across nodes and clusters. |
| Model Splitting and Distributed Inference | Break large models (such as LLMs) into smaller splits and deploy them as vDAGs across the infrastructure; the approach is framework-agnostic. |
| Developer-Friendly SDKs | Use SDKs to write and deploy AI model servers or compute logic across the distributed network. |
| Third-Party Framework Integration | Bring your own stack: wrap existing frameworks and libraries as Blocks using init containers. |
| Multiple Instance Execution / GPU Sharing | Run multiple Block instances on the same node; a single GPU can be time-shared across multiple instances for maximum efficiency. |
| Customizable Parser-Based Workload Definitions | Define workloads using flexible, pluggable parsers to support different input formats and metadata structures. |
| Policy-Based Load Balancing and Health Checks | Use policy logic to drive runtime decisions for load balancing, instance health, and failover handling. |
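The policy rows above are central to how AIOS works: infrastructure behavior is programmed rather than configured. As a rough, non-authoritative illustration, the sketch below shows what a Python scheduling policy could look like. The class name, the `evaluate` hook, and the metric fields are hypothetical stand-ins, not the platform's actual policy interface.

```python
# Hypothetical sketch of a Python scheduling policy. The class name,
# method signature, and metric fields are illustrative assumptions,
# not the actual AIOS policy interface.

class GpuAwareSchedulingPolicy:
    """Choose the cluster with the most free GPU memory for a new Block."""

    def __init__(self, settings: dict):
        # Example settings: {"min_free_gpu_mem_mb": 8192}
        self.min_free = settings.get("min_free_gpu_mem_mb", 8192)

    def evaluate(self, block_spec: dict, cluster_metrics: list) -> dict:
        # cluster_metrics is assumed to look like:
        #   [{"cluster_id": "c1", "free_gpu_mem_mb": 24576, "healthy": True}, ...]
        candidates = [
            c for c in cluster_metrics
            if c["healthy"] and c["free_gpu_mem_mb"] >= self.min_free
        ]
        if not candidates:
            return {"allowed": False, "reason": "no cluster meets GPU requirements"}
        best = max(candidates, key=lambda c: c["free_gpu_mem_mb"])
        return {"allowed": True, "target_cluster": best["cluster_id"]}
```

Because a policy is plain Python, the same pattern extends to the load-balancing, health-check, and quota decisions described below.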
## Breakdown of Features
- **Global Cluster Networking**: Easily connect Kubernetes clusters across regions, forming a globally distributed, policy-governed compute mesh.
- **Node Onboarding**: Add VMs or bare-metal nodes to any cluster within the network, enabling flexible infrastructure expansion.
- **Custom Rule-Based Orchestration**: Write Python policies to control how clusters and networks are formed and how workloads are scheduled, tailored to your specific operational needs.
- **Python-Native Policy Engine**: Policies are written in Python, offering high expressiveness and support for external libraries, enabling complex logic and integration.
- **Flexible Policy Deployment Modes**: Deploy policies as standalone services, ad hoc jobs, or policy graphs, depending on the use case.
- **Decentralized Registries**: Set up and register your own asset or container registries on any cluster. These registries are globally discoverable and shareable within the network.
- **Block and vDAG Specification via SDKs**: Define compute workloads (e.g., LLM inference or object detection) using the Python SDK. Compose them into vDAGs to form cross-node or cross-cluster workflows. Blocks can be reused across multiple vDAGs (see the first sketch after this list).
- **Sidecar Extensions for Blocks**: Extend the functionality of Blocks through customizable sidecar containers.
- **Resource-Aware Scheduling**: Use policies to control resource allocation, auto-scaling, and load balancing. Blocks can scale across nodes and utilize multiple GPUs as needed.
- **GPU Sharing Across Blocks**: Schedule multiple Block instances on the same GPU for efficient resource utilization.
- **End-to-End Metrics Collection**: Collect metrics from Blocks, vDAGs, and nodes. Use them in policy logic for decision-making, or define custom metrics as needed.
- **Policy-Based Auditing and Quotas**: Apply policies for vDAG-level audit logging, access controls, and quota management.
- **Custom Health Checks**: Define health-check logic using policies for fine-grained monitoring.
- **gRPC-Based Inference APIs**: Submit tasks to Blocks or vDAGs via gRPC-based inference servers (see the second sketch after this list).
- **Multi-Gateway Inference Support**: Any user or administrator can deploy their own inference server and register it in a public directory. Each server can enforce its own policies for quotas and access control.
- **Customizable Specification Format**: Define and extend the specification format for onboarding clusters, nodes, Blocks, and vDAGs. Use policies to build custom specification parsers.
- **Reusable Specification Store**: Browse, search, and reuse predefined or customized specifications to quickly deploy Blocks and vDAGs.
- **Third-Party System Integration**: Seamlessly extend Blocks with third-party services or tools, deployed either alongside or externally, automated via init containers.
- **LLM Splitting and Reusability**: Split large LLMs into modular components and distribute them as vDAGs. Each model chunk can be reused across multiple vDAGs, enabling scalable and efficient deployments.
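To make the Block and vDAG items above concrete, here is a minimal sketch of declaring two reusable Blocks (the shards of a split LLM) and wiring them into a vDAG. The `Block` and `vDAG` shapes below are invented for illustration; the real SDK types and fields live in the SDK documentation.

```python
# Illustrative sketch only: these dataclasses approximate what a Block
# and vDAG declaration might carry; they are not the published SDK API.
from dataclasses import dataclass, field

@dataclass
class Block:
    name: str
    image: str                 # container image with the model server
    gpus: int = 0

@dataclass
class vDAG:
    name: str
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)   # (from_block, to_block) pairs

# An LLM split into two shards, each shard a reusable Block:
shard_a = Block(name="llm-shard-a", image="registry.local/llm:shard-a", gpus=1)
shard_b = Block(name="llm-shard-b", image="registry.local/llm:shard-b", gpus=1)

pipeline = vDAG(
    name="split-llm-inference",
    nodes=[shard_a, shard_b],
    edges=[("llm-shard-a", "llm-shard-b")],     # activations flow a -> b
)
```

The same `shard_a` declaration could appear in other vDAGs, which is the reuse that the LLM-splitting item above refers to.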
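For the gRPC inference item, the call below shows the generic shape of a request using the real `grpcio` package; the proto-generated module, stub, and message names are placeholders for whatever the platform's actual `.proto` files define.

```python
# Generic shape of a gRPC inference call. `grpc` is the real grpcio
# package; the proto-generated module, stub, and message names below
# are placeholders, not the platform's actual definitions.
import grpc

import inference_pb2        # placeholder: generated from the platform's protos
import inference_pb2_grpc   # placeholder: generated from the platform's protos

def submit_task(target: str, payload: bytes) -> bytes:
    """Send one inference request to a Block or vDAG gateway."""
    with grpc.insecure_channel(target) as channel:
        stub = inference_pb2_grpc.InferenceStub(channel)     # placeholder stub
        request = inference_pb2.InferRequest(data=payload)   # placeholder message
        response = stub.Infer(request, timeout=30.0)
        return response.data
```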
## Getting Started
**Essentials**

- Paper
- Concepts
- Architecture

**User Flow Guides**

- Network Creator & Admin Flow
- Cluster Contributor & Admin Flow
- Node Contributor Flow
- Block Creator Flow
- vDAG Creator Flow
- End User (Inference Task Submitter) Flow

**Installation**

- Network Creation
- Onboarding Cluster
- Onboarding Node to a Cluster
### Quickstart Tutorial
The quickstart tutorial covers:

- Simple Block deployment across multiple GPUs (reference model: Mistral7B LLM)
- Simple Block deployment on a single GPU (sample model: YOLOv5)
- Linking an externally deployed vLLM system to a Block for serving (see the sketch after this list)
- Deploying an external system alongside a Block using init containers
- Splitting LLMs and deploying them across the network as a vDAG
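For the vLLM bullet, note that vLLM exposes an OpenAI-compatible HTTP API, so a Block's logic can simply forward prompts to it. A minimal sketch, assuming a vLLM server reachable at `http://vllm.internal:8000` and loaded with Mistral-7B (the host and model name are assumptions for illustration):

```python
# Minimal sketch: forward a prompt from a Block to an externally
# deployed vLLM server via its OpenAI-compatible completions endpoint.
# The host and model name are illustrative assumptions.
import requests

VLLM_URL = "http://vllm.internal:8000/v1/completions"

def generate(prompt: str, max_tokens: int = 128) -> str:
    resp = requests.post(
        VLLM_URL,
        json={
            "model": "mistralai/Mistral-7B-v0.1",
            "prompt": prompt,
            "max_tokens": max_tokens,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]
```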
## Upcoming Activities
- **Comparison Document**: A detailed comparison between AIOSv1 and Ray/AnyScale. To be announced.
- **Benchmarking and Performance Analysis**: Evaluation of system services, cluster services, Block components, and end-to-end benchmarking of popular LLM and non-LLM models on the platform. (To submit or suggest a model for benchmarking, please open an issue.) To be announced.
- **Mainnet Release**: Launch of the mainnet, supporting both public and private deployments. To be announced.
- **Platform Security and IAM**: Implementation of security measures for all platform services, including user IAM using decentralized identity protocols, role-based access control (RBAC), and integration with the policy system for fine-grained security actions. To be announced.
- **Model/Asset Security**: End-to-end security for models and assets, along with enhanced security for policy execution. To be announced.
## Communications
- Email: [email protected]
- Discord: OpenCyberspace
- X (Twitter): @opencyberspace
## Call for Contributors
AIGrid is an open, collaborative project, and we're actively looking for contributors who resonate with the mission of building open, plural, networked AI infrastructure.
We welcome:

- **Systems thinkers & protocol designers**: Help refine the architecture of polycentric networks
- **Distributed systems engineers**: Build and scale the open execution layer
- **AI/ML developers**: Create interoperable cognitive modules and agent topologies
- **Researchers in ethics, governance, trust, alignment, guardrails, incentives, and economics**: Design and evolve the policy layers
- **Writers & communicators**: Help document, narrate, and amplify the vision
- **Hackers, tinkerers, visionaries**: If this speaks to you, you're already one of us
Whether you want to:
- Co-design AI primitives
- Propose a new kind of network
- Experiment with governance models
- Help run a sovereign AIGrid node, cluster, or network
We'd love to hear from you.

Join the Collective

Reach out: [email protected]

Let's co-create an open & networked AI future: plural, sovereign, and evolving.