Industrial-Grade
Model Training.
Train, evaluate, and deploy high-performance ML models on your own infrastructure with a seamless workflow.
Training Pipeline
Data Staging
Ingest and version raw data with MinIO + Parquet chunking.
Chunking
Automatically partition data into optimized training batches.
Training
Run distributed training with live loss monitoring.
Registry
Promote models to a versioned, production-ready registry.
170+
Models Created
218
Training Runs
27
Deployed
88.2%
Avg Accuracy
Two Ways to Build Your Model
Choose the path that fits your team's technical expertise.
Guided ML
Perfect for business analysts and domain experts. Build high-accuracy models using our automated machine learning (AutoML) pipeline. No background in data science or programming needed.
Model Type
Choose from pre-configured business templates
Dataset
Point-and-click data selection from any source
Configure
Simple target selection and feature toggling
Review & Deploy
One-click deployment to production endpoints
Custom Model
For data scientists who want full control. Upload your own Python code or link a Git repository. RiverGen handles the infrastructure, scaling, and versioning for you.
Model Architecture
Support for PyTorch, TensorFlow, and scikit-learn
Git Integration
Directly pull and sync from GitHub or GitLab
Hyperparameters
Fine-tune with custom training arguments
Custom Configure
Define entry points and environment setups
Review & Deploy
Scale across GPU-accelerated clusters
From Data to Deployed Model in 4 Steps
The intuitive interface that makes machine learning accessible.
Choose Your Foundation
RiverGen provides pre-trained architectures optimized for common business problems. Each template comes with built-in best practices for data processing and evaluation.
Classification
Predict categories or binary outcomes from structured data.
Common Use Cases
Regression
Predict precise numerical values and ranges.
Common Use Cases
Time Series
Identify patterns and forecast future temporal values.
Common Use Cases
Clustering
Discover hidden segments and patterns in unlabeled data.
Common Use Cases
Custom Model — Bring Your Own Code
Total flexibility for research teams and ML engineers. Integrate your existing codebases, research notebooks, and custom training logic into our industrial-grade orchestration layer.
Select Custom Model Template
Unlock code-level controls by choosing the custom container template. Perfect for PyTorch, TensorFlow, or JAX research projects that require unique environment configurations.
Code Integration & Git Sync
Directly sync from private GitHub or GitLab repositories. We support multi-file projects and automatically detect your dependencies from requirements.txt, conda.yml, or pyproject.toml.
Drag source or .zip here
Max size 100MB · Automatic conda.yml detection
Secure Data Reference
Reference your Data Sources directly in code. Model Studio handles secure ingestion, credential management, and high-speed delivery to your training nodes.
Advanced Configuration
Define custom hyperparameters, entry points, and CLI arguments. Our orchestrator injects these directly into your training environment.
Hyperparameters
Entry Script
python -m research.train --cfg=conf/dist.yaml
High-Compute Provisioning
Select from standard or high-memory GPU instances. Launch distributed training runs across NVIDIA Tesla T4 or A100 clusters with one click.
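An entry script on the custom path typically exposes its hyperparameters as CLI flags so the orchestrator can inject them. A minimal sketch using argparse; the --cfg flag mirrors the example command above, while the other flag names and defaults are purely illustrative:

```python
import argparse

def parse_args(argv=None):
    """Parse training arguments injected by the orchestrator."""
    parser = argparse.ArgumentParser(description="Custom training entry point")
    parser.add_argument("--cfg", default="conf/dist.yaml",
                        help="Path to the training config file")
    parser.add_argument("--lr", type=float, default=3e-4,
                        help="Learning rate")
    parser.add_argument("--epochs", type=int, default=20,
                        help="Number of training epochs")
    return parser.parse_args(argv)

if __name__ == "__main__":
    args = parse_args()
    print(f"cfg={args.cfg} lr={args.lr} epochs={args.epochs}")
```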
from rivergen import ModelStudio
import torch
import torch.nn as nn

class TransformerModel(nn.Module):
    def __init__(self, d_model=128):
        super().__init__()
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8),
            num_layers=6
        )
        self.fc = nn.Linear(d_model, 1)

    def forward(self, x):
        return self.fc(self.encoder(x))

# Establish secure connection
studio = ModelStudio.connect()

# Launch distributed training
studio.train(
    model=TransformerModel(),
    data="financial_ops_prod",
    config="configs/prod_v2.yaml",
    gpu=True,
    nodes=4
)
Runtime Compatibility
How Your Data Gets There
Secure, automated data ingestion — no pipeline code required. We bridge the gap between your raw operational data and high-performance training environments.
Source Systems
Code, NoCode, or APIs
MSA Ingestion
Validation & Schema Sync
MinIO Storage
Immutable Parquet Chunks
Training Compute
Distributed GPU Clusters
Automated MSA Batching
The Model Studio Agent (MSA) automatically handles the heavy lifting of data preparation. It partitions massive datasets into optimized Parquet format, ensuring sub-millisecond retrieval during training epochs while maintaining strict schema consistency across every run.
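The partitioning step described above can be sketched in a few lines. This is not the MSA's internal logic, only the general fixed-size chunking technique in pure Python:

```python
def chunk_rows(rows, chunk_size):
    """Partition a sequence of rows into fixed-size chunks.

    The agent writes each chunk out as a separate file; this sketch
    covers only the partitioning step.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
```

In practice each chunk would then be serialized to Parquet, for example with pyarrow's `pyarrow.parquet.write_table`.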
Secure Immutable Staging
Every byte of your data is staged in a private, encrypted MinIO environment. We use immutable snapshots to ensure that model results are 100% reproducible. Your raw data never leaves your infrastructure perimeter, maintaining total compliance and security.
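Immutability of this kind is commonly enforced through content addressing. A sketch of the general idea, not Model Studio's actual mechanism: each staged chunk is pinned to a SHA-256 digest that a training run can record and later verify.

```python
import hashlib

def snapshot_digest(data: bytes) -> str:
    """Return a content address for a staged data chunk.

    Any change to the bytes yields a different digest, so a training
    run that records this value can always verify its exact inputs.
    """
    return hashlib.sha256(data).hexdigest()
```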
Train on Your Schedule
Model Studio doesn't just train once. It becomes a living part of your data stack, ensuring your intelligence layer evolves as fast as your business.
Scheduled Retraining
Maintain model freshness by setting cron-based schedules. Automatically retrain nightly or weekly to capture the latest market shifts and user behaviors without manual intervention.
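A nightly schedule is expressed as a cron string such as `0 2 * * *` (every day at 02:00). As a minimal illustration of what the scheduler computes, here is a standard-library sketch of the next nightly run time; the real scheduler would parse arbitrary cron expressions:

```python
from datetime import datetime, timedelta

def next_nightly_run(now, hour=2):
    """Return the next occurrence of the given hour of day,
    i.e. the cron schedule '0 2 * * *' for the default hour=2."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```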
Event-Triggered Jobs
Trigger training runs based on data thresholds or system events. If your dataset grows by 10% or a new batch of labeled data arrives, Model Studio starts work immediately.
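The 10% growth threshold mentioned above amounts to a simple relative-change check; a sketch of that trigger condition (illustrative, not the product's event engine):

```python
def growth_trigger(prev_rows, current_rows, threshold=0.10):
    """Return True when the dataset has grown by at least `threshold`
    (10% by default), signalling that a retraining run should start."""
    if prev_rows <= 0:
        # Any data arriving into an empty dataset counts as a trigger.
        return current_rows > 0
    return (current_rows - prev_rows) / prev_rows >= threshold
```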
Intelligent Monitoring
Stay informed with context-rich alerts. Get notified via Slack, Email, or Webhooks about convergence success, accuracy improvements, or infrastructure health.
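A webhook alert of this kind is just an HTTP POST with a JSON body. A minimal sketch using only the standard library; the payload fields and webhook style are assumptions, not RiverGen's documented alert schema:

```python
import json
import urllib.request

def build_alert(model, event, metric):
    """Build a JSON-serialisable alert payload."""
    return {"model": model, "event": event, "metric": metric}

def post_alert(url, payload):
    """POST the alert as JSON to an incoming-webhook endpoint."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)
```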
Centralized Model Registry
Your organization's institutional knowledge, versioned and protected. Every model is a documented asset ready for deployment.
Zero-Downtime Hot-Swapping
Deploy new versions to production APIs without dropping a single request.
Immutable Model Snapshots
Every version is permanently archived with its weights, code, and data links.
Native A/B Testing
Route traffic between versions to validate real-world performance improvements.
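Traffic routing for A/B tests is typically done by hashing a stable identifier so each caller consistently sees the same version. A sketch of that technique, not the registry's actual router:

```python
import hashlib

def assign_variant(user_id, split=0.5):
    """Deterministically route a user to model version "A" or "B".

    Hashing the user id keeps assignments stable across requests, so a
    given user always hits the same model version during the test.
    """
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 1000
    return "A" if bucket < split * 1000 else "B"
```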
| Identifier | Architecture | Status | Metric (Acc) |
|---|---|---|---|
| customer_churn v3.2.0 • Trained on 1.2M rows | XGBoost | Production | 94.2% |
| revenue_forecast v1.5.1 • Resource timeout | LSTM | Retrying | 72.0% |
| fraud_detect v4.0.0 • Candidate for prod | Transformer | Staging | 98.8% |
| inventory_opt v2.1.0 • Configuring features | Prophet | Draft | -- |
Showing 4 of 128 registered models
What is Model Studio?
The brains powering every prompt and answer in RiverGen.
Model Studio manages which intelligence settings are active, how they are being used, and where they are connected in your workflows. While the underlying technology can be complex, Model Studio keeps this complexity hidden from everyday users and surfaces only simple, understandable controls.
For Non-Technical Teams
Roll out improvements once. Every team benefits automatically.
When you run a prompt, it uses an approved and well-tuned intelligence layer. Changes can be rolled out centrally so prompts across the organization benefit from improvements — without each user having to change anything. Over time, Model Studio keeps answers accurate, consistent, and aligned with business needs.
Live Training Monitor
Eliminate the "black box" of AI training. RiverGen provides deep visibility into your model's learning process, allowing teams to optimize resources and halt failing runs instantly.
Current Loss
0.082
Last epoch
Best Loss
0.079
Epoch 7
ETA
~12m
Remaining
Industrial Orchestration
From simple linear models to complex deep learning architectures, our SDK handles the plumbing. Launch, scale, and monitor jobs with just a few lines of code.
from rivergen import ModelStudio

# Initialize session
studio = ModelStudio.connect()

# Create managed training job
job = studio.train(
    model="./models/transformer_v2",
    data_source="prod_postgres",
    compute="gpu_standard",
    config={
        "epochs": 20,
        "batch_size": 64,
        "optimizer": "adam"
    }
)

# Stream live telemetry
job.monitor(live=True)