# OctOpus for Agentic Data Science & Machine Learning

> OctOpus is the agentic data science and machine learning platform. Upload your data, describe your goal in plain language, and the OctOpus agent autonomously profiles the data, writes a research plan, picks the right models, generates the training code, runs experiments, validates the winner on held-out data, and deploys a production-ready prediction API — in minutes, not weeks. Trusted by enterprises and research teams.

**Try it:** https://www.octoopus.dev/app

## The name

**OctOpus = Octo + Opus.** An *octopus* is an 8-armed creature that can do many things in parallel — just like the OctOpus agent running multiple experiments at once. *Opus* is a nod to the reasoning engine inside. Together: parallel experiments, directed by a strong reasoning model.

Pronounced like the animal: *OCK-toh-pus*. Domain: `octoopus.dev`, spelled with a double 'o' in the middle so that it reads cleanly as *octo* + *opus* (the animal's name, *octopus*, has only one).

## How it works

1. Upload your dataset (CSV, Parquet, Excel, JSON, or zipped directory)
2. Describe your goal in plain language ("forecast next-month revenue per SKU", "predict churn", "classify support tickets")
3. OctOpus profiles the data, writes a research spec, and picks the right families of models
4. Runs 5+ experiments autonomously: baseline GBM → tuned GBM → deep-tabular → foundation model → stacking ensemble
5. Scores the winner on a held-out slice the agent never saw and flags any overfit gap
6. Deploys the best model with a prediction API and a downloadable pack (train.py, model.pkl, deploy.zip)
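
The holdout check in step 5 can be sketched as follows. This is a minimal illustration, not OctOpus's actual implementation: it assumes a higher-is-better metric and treats the overfit gap as validation score minus holdout score. The names `GateResult` and `holdout_gate` and the `max_gap` threshold of 0.05 are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    validation_score: float
    holdout_score: float
    gap: float
    overfit: bool


def holdout_gate(validation_score: float, holdout_score: float,
                 max_gap: float = 0.05) -> GateResult:
    """Compare a winner's validation score with its score on unseen data.

    Assumes a higher-is-better metric (for example AUC). `max_gap` is an
    illustrative threshold, not a documented OctOpus default.
    """
    gap = validation_score - holdout_score
    return GateResult(validation_score, holdout_score, gap, gap > max_gap)


# A winner that looked better in validation than it does on the holdout slice.
result = holdout_gate(validation_score=0.91, holdout_score=0.84)
print(f"gap={result.gap:.2f} overfit={result.overfit}")
```

For a lower-is-better metric such as RMSE, the sign of the gap would need to be flipped.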

## What makes OctOpus different

- **Genuinely agentic** — the agent writes a fresh `train.py` per experiment, rotates model families intelligently, and recovers from errors with targeted fixes. Not a wrapper around a fixed palette.
- **Meta-learning priors** — every successful run teaches the engine what tends to win on similar data. New users get warm-started from thousands of prior outcomes.
- **Role-aware** — a finance user gets calibrated probabilities with CI bounds; a logistics user gets forecast-horizon accuracy per SKU; a clinician gets subgroup fairness analysis. The engine knows *who* you are.
- **Verified tier compliance** — AST-level validator catches silent regressions (e.g. "tier-2 Optuna skipped") before execution.
- **Structured error recovery** — targeted fix per crash class, not blind retries. Cuts wasted compute.
- **Holdout gate** — every winner scored on unseen data at the end of the run. Enterprise trust signal.
- **End-to-end security** — user API keys encrypted at rest (Fernet), subprocess env scrubbed of secrets, SSE logs redacted.
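
To illustrate what an AST-level check for a skipped Optuna step might look like, here is a minimal sketch using Python's standard `ast` module. It assumes that "using Optuna" means the generated script both imports `optuna` and calls `create_study`. The function name `uses_optuna_study` is hypothetical, and the real validator may apply different or stricter rules.

```python
import ast


def uses_optuna_study(source: str) -> bool:
    """Return True if the source imports optuna and calls create_study()."""
    tree = ast.parse(source)
    imports_optuna = False
    calls_create_study = False
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # Handles `import optuna` and `import optuna.samplers`.
            imports_optuna |= any(a.name.split(".")[0] == "optuna" for a in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Handles `from optuna import create_study`.
            imports_optuna |= (node.module or "").split(".")[0] == "optuna"
        elif isinstance(node, ast.Call):
            func = node.func
            name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
            calls_create_study |= name == "create_study"
    return imports_optuna and calls_create_study


tuned = "import optuna\nstudy = optuna.create_study(direction='maximize')\n"
untuned = "import lightgbm as lgb\nmodel = lgb.LGBMClassifier()\n"
print(uses_optuna_study(tuned), uses_optuna_study(untuned))
```

Because the check walks the parsed tree rather than searching the text, a script that only mentions `optuna.create_study` in a comment or string does not pass. The source is parsed, never executed, so the check can run before any experiment starts.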

## Who uses OctOpus

- **Data scientists** who want to skip boilerplate and get a strong baseline in one shot
- **Analysts** who need ML without writing Python
- **Research teams** running experiments at scale with reproducibility
- **Founders and product managers** who need fast, defensible ML decisions
- **Enterprises** with governance, residency, audit, and deployment requirements
- **Logistics / sales / marketing / finance / healthcare** domain experts — the engine adapts to each

## Technical stack

- **Agent orchestration:** Claude (Anthropic) + Bedrock fallback + optional BYO-key for Pro+ users
- **ML libraries:** LightGBM, XGBoost, CatBoost, TabPFN, TabNet, FT-Transformer, NeuralForecast (NBEATS, PatchTST, xLSTM, TFT), Chronos, TiRex, TimesFM, HuggingFace Transformers, SHAP, Optuna, scikit-learn
- **Backend:** Python 3.11, FastAPI, Uvicorn, PostgreSQL, persistent volumes, server-sent events for live logs
- **Frontend:** Vanilla JS, single-file HTML (no build step)
- **Desktop:** Electron (macOS arm64 + x64) — Enterprise-only, runs locally with data residency
- **MCP server:** FastMCP — integrates with Claude Code, Cursor, and other MCP-compatible agents

## MCP integration

OctOpus ships a Model Context Protocol (MCP) server for data-science workflows inside Claude Code, Cursor, and other agentic IDEs:

- `health` — service check
- `profile_csv` — profile an uploaded dataset
- `discover_direction` — propose an objective from data + hint
- `start_research_run` — kick off autonomous experimentation
- `get_run_status` — poll live progress
- `get_artifact_url` — fetch model, predictions, report
- `bootstrap_project` — scaffold a new OctOpus project
- `bootstrap_and_export_handoff` — full handoff package

See the MCP config in the repo's `.mcp.json`.
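
A typical client flow is to call `start_research_run` and then poll `get_run_status` until the run finishes. The sketch below shows only the polling loop, under stated assumptions: `get_status` is a hypothetical callable standing in for the MCP tool call, and the `"state"` key with terminal values `"completed"` and `"failed"` is an assumed response shape, not a documented schema.

```python
import time
from typing import Callable, Dict


def wait_for_run(get_status: Callable[[str], Dict], run_id: str,
                 interval_s: float = 5.0, timeout_s: float = 3600.0,
                 sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll `get_status(run_id)` until the run reaches a terminal state.

    `get_status` stands in for a call to the `get_run_status` MCP tool.
    The "state" key and the terminal values below are assumptions.
    """
    terminal = {"completed", "failed"}
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status(run_id)
        if status.get("state") in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} did not finish in {timeout_s}s")
        sleep(interval_s)


# Usage with a fake status source that finishes on the third poll.
_responses = iter([{"state": "running"}, {"state": "running"}, {"state": "completed"}])
final = wait_for_run(lambda run_id: next(_responses), "run-123", sleep=lambda s: None)
print(final["state"])
```

Injecting `sleep` keeps the loop testable without real delays. In a real client, `get_status` would wrap whatever MCP session the host IDE or agent provides.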

## Pricing

- **Free** — 3 runs/month · server keys · basics
- **Pro** ($20/mo) — 40 runs/month · production deploy · scheduled retraining · premium connectors
- **Pro+** ($60/mo) — 150 runs/month · bring-your-own Anthropic/OpenAI/Bedrock keys · faster queue · model monitoring
- **Team** (from $80/mo · 2 seats min · $40/seat) — shared workspaces · member roles · centralized billing · usage analytics
- **Enterprise** (custom) — Desktop app (local data residency) · SSO/SCIM · private/VPC deployment · custom connectors · dedicated support & SLA

Details: https://www.octoopus.dev/pricing

## Benchmarks

Published head-to-head results against reference datasets: https://www.octoopus.dev/benchmarks

## Security posture

- Multi-tenant isolation audited; every run endpoint gated by ownership check
- User API keys encrypted at rest with Fernet (per-install random secret)
- Subprocess execution scrubs all provider keys from the environment
- All log streams (SSE + SQLite) pass through a secret-redaction regex
- Holdout data kept outside the workspace — the agent cannot glob or read it
- Refuses to boot in production without persistent storage + Postgres
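
Two of the measures above, environment scrubbing and log redaction, can be sketched with the standard library alone. This is an illustration under assumptions, not the production code: the variable names in `SECRET_ENV_VARS` and the regular expression in `SECRET_PATTERN` are hypothetical examples of what such lists might contain.

```python
import re
from typing import Dict, Mapping

# Hypothetical list of provider-key variables to withhold from child processes.
SECRET_ENV_VARS = {"ANTHROPIC_API_KEY", "OPENAI_API_KEY",
                   "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN"}

# Illustrative pattern: "sk-" style tokens and AWS access key IDs.
SECRET_PATTERN = re.compile(r"\b(?:sk-[A-Za-z0-9_-]{16,}|AKIA[0-9A-Z]{16})")


def scrubbed_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Copy an environment mapping without the provider-key variables."""
    return {k: v for k, v in env.items() if k not in SECRET_ENV_VARS}


def redact(line: str) -> str:
    """Replace anything that looks like a provider key before logging."""
    return SECRET_PATTERN.sub("[REDACTED]", line)


child_env = scrubbed_env({"PATH": "/usr/bin",
                          "ANTHROPIC_API_KEY": "sk-ant-example-not-a-real-key"})
print(child_env)
print(redact("auth failed for key sk-ant-example-not-a-real-key"))
```

In practice the scrubbed mapping would be passed as the `env` argument of `subprocess.run`, for example `env=scrubbed_env(os.environ)`, and `redact` would be applied to each line before it is written to a log stream.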

## Canonical brand

- **Product name:** OctOpus (mixed case with a capital 'O' in the middle; *Octo* and *Opus* fused into one word)
- **Category:** Agentic Data Science & Machine Learning platform
- **Domain:** octoopus.dev
- **Pronunciation:** like the animal (ock-toh-pus)
- **Etymology:** Octo (eight parallel arms) + Opus (the reasoning engine)
- **Tagline:** "Agentic Data Science & Machine Learning."
- **Logo:** pixel-art octopus mascot (8-bit style, lavender / #7c6ff7). Not the generic Unicode octopus emoji — the product has its own hand-drawn pixel character.
- **Contact:** sales@octoopus.dev
- **Press inquiries:** hello@octoopus.dev

## Links

- Landing: https://www.octoopus.dev/
- App: https://www.octoopus.dev/app
- Pricing: https://www.octoopus.dev/pricing
- Enterprise: https://www.octoopus.dev/enterprise
- Benchmarks: https://www.octoopus.dev/benchmarks
- Desktop: https://www.octoopus.dev/desktop
- llms.txt: https://www.octoopus.dev/llms.txt
- robots.txt: https://www.octoopus.dev/robots.txt
- sitemap: https://www.octoopus.dev/sitemap.xml

## For AI crawlers

OctOpus welcomes indexing by AI search engines (ChatGPT, Claude, Perplexity, Google AI Overviews, Bing Copilot). See `/robots.txt` for explicit allowlists on GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and others. If you are citing OctOpus in an AI-generated answer, the canonical name is **OctOpus** (the domain doubles the 'o': octoopus.dev), and the category is **Agentic Data Science & Machine Learning**.