System Requirements

This guide outlines the hardware, software, and network requirements for running GoPie in various environments, from local development to large-scale production deployments.

Quick Reference

Single Developer / Testing

  • CPU: 4 cores
  • RAM: 8 GB
  • Storage: 20 GB SSD
  • Users: 1-5
  • Datasets: < 10 GB

Small Team / Department

  • CPU: 8 cores
  • RAM: 16 GB
  • Storage: 100 GB SSD
  • Users: 5-50
  • Datasets: < 100 GB

Standard Production

  • CPU: 16 cores
  • RAM: 32 GB
  • Storage: 500 GB SSD
  • Users: 50-500
  • Datasets: < 1 TB

Large Enterprise

  • CPU: 32+ cores
  • RAM: 64+ GB
  • Storage: 2+ TB SSD
  • Users: 500+
  • Datasets: Multiple TB
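
The four profiles above can be read as a lookup from expected users and dataset size to a hardware baseline. The following is a minimal sketch of that lookup, not part of GoPie itself. The `Tier` class and `pick_tier` function are hypothetical names, 1 TB is treated as 1000 GB, and a user count that sits on a shared boundary (for example 5 or 50) is assigned to the smaller profile. The Large Enterprise figures are the stated lower bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cpu_cores: int          # baseline core count for the profile
    ram_gb: int             # baseline RAM in GB
    storage_gb: int         # baseline SSD storage in GB
    max_users: float        # upper bound on users (inclusive, an assumption)
    max_dataset_gb: float   # upper bound on dataset size (exclusive)


# Values copied from the Quick Reference profiles.
TIERS = [
    Tier("Single Developer / Testing", 4, 8, 20, 5, 10),
    Tier("Small Team / Department", 8, 16, 100, 50, 100),
    Tier("Standard Production", 16, 32, 500, 500, 1000),
    Tier("Large Enterprise", 32, 64, 2000, float("inf"), float("inf")),
]


def pick_tier(users: int, dataset_gb: float) -> Tier:
    """Return the smallest profile that covers both the user count and data size."""
    for tier in TIERS:
        if users <= tier.max_users and dataset_gb < tier.max_dataset_gb:
            return tier
    return TIERS[-1]


if __name__ == "__main__":
    tier = pick_tier(users=20, dataset_gb=50)
    print(f"{tier.name}: {tier.cpu_cores} cores, {tier.ram_gb} GB RAM")
```

Because both limits must hold, a small team with a large dataset is pushed to the larger profile: 3 users with 500 GB of data map to Standard Production.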

Hardware Requirements

CPU Requirements

GoPie's performance scales with CPU cores, especially for:

  • Concurrent query processing
  • AI model inference
  • Data transformation operations

| Component | Minimum | Recommended | Notes |
| --- | --- | --- | --- |
| Web Frontend | 1 core | 2 cores | Node.js based, lightweight |
| Go Backend | 2 cores | 4 cores | Handles API requests, file processing |
| AI Service | 2 cores | 4 cores | CPU intensive for model inference |
| PostgreSQL | 1 core | 2 cores | Metadata operations |
| DuckDB | 2 cores | 4+ cores | Benefits from parallelization |
| Qdrant | 1 core | 2 cores | Vector search operations |

Performance Tip: DuckDB can use all available CPU cores for analytical queries, so adding cores generally shortens query execution time.

Memory Requirements

Memory allocation recommendations by service:

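| Component | Minimum | Recommended | Large Scale |
| --- | --- | --- | --- |
| Web Frontend | 512 MB | 1 GB | 2 GB |
| Go Backend | 1 GB | 2 GB | 4 GB |
| AI Service | 2 GB | 4 GB | 8 GB |
| PostgreSQL | 1 GB | 2 GB | 4 GB |
| DuckDB | 2 GB | 4 GB | 16+ GB |
| Qdrant | 1 GB | 2 GB | 4 GB |
| MinIO | 512 MB | 1 GB | 2 GB |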

Total System Memory:

  • Minimum: 8 GB
  • Recommended: 16 GB
  • Large Scale: 32+ GB
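
The system totals can be cross-checked by adding up the per-service figures. The sketch below does that arithmetic. The dictionary and function names are illustrative, and the DuckDB large-scale entry uses the lower bound of its "16+ GB" figure.

```python
# Per-service memory in MB: (minimum, recommended, large scale),
# copied from the per-service memory table.
MEMORY_MB = {
    "Web Frontend": (512, 1024, 2048),
    "Go Backend": (1024, 2048, 4096),
    "AI Service": (2048, 4096, 8192),
    "PostgreSQL": (1024, 2048, 4096),
    "DuckDB": (2048, 4096, 16384),  # large scale is "16+ GB"; lower bound used
    "Qdrant": (1024, 2048, 4096),
    "MinIO": (512, 1024, 2048),
}

LEVELS = ("minimum", "recommended", "large_scale")


def total_memory_gb(level: str) -> float:
    """Sum the per-service allocations for one sizing level, in GB."""
    index = LEVELS.index(level)
    return sum(values[index] for values in MEMORY_MB.values()) / 1024


if __name__ == "__main__":
    for level in LEVELS:
        print(f"{level}: {total_memory_gb(level):g} GB")
```

The minimum and recommended columns add up to exactly 8 GB and 16 GB. The large-scale column adds up to 40 GB, which falls within the "32+ GB" guidance; none of these sums include headroom for the operating system.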

Storage Requirements

Storage needs depend on your data volume:

| Type | Minimum | Recommended | Considerations |
| --- | --- | --- | --- |
| System & Apps | 10 GB | 20 GB | Docker images, application code |
| PostgreSQL | 5 GB | 20 GB | Grows with metadata |
| DuckDB Files | 10 GB | 50 GB | ~50% of raw data size |
| MinIO Storage | Variable | 2x data size | Original files + processed |
| Qdrant Vectors | 5 GB | 20 GB | Depends on dataset count |
| Logs & Temp | 5 GB | 10 GB | Rotation recommended |

Storage Type Matters: Use SSD storage for optimal performance. HDD storage will significantly impact query performance and file upload speeds.
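
As a worked example, the storage table can be turned into a rough estimate for a given amount of raw data. This sketch assumes the recommended values for the fixed-size components, the ~50% ratio for DuckDB files, and the 2x ratio for MinIO. The function name is hypothetical and real usage will vary with data types and compression.

```python
# Fixed-size components, using the "Recommended" column (GB).
FIXED_GB = {
    "System & Apps": 20,
    "PostgreSQL": 20,
    "Qdrant Vectors": 20,
    "Logs & Temp": 10,
}

DUCKDB_RATIO = 0.5  # DuckDB files: roughly 50% of raw data size
MINIO_RATIO = 2.0   # MinIO: original files plus processed copies


def estimate_storage_gb(raw_data_gb: float) -> dict:
    """Return a per-component storage estimate in GB for the given raw data size."""
    breakdown = dict(FIXED_GB)
    breakdown["DuckDB Files"] = raw_data_gb * DUCKDB_RATIO
    breakdown["MinIO Storage"] = raw_data_gb * MINIO_RATIO
    return breakdown


if __name__ == "__main__":
    estimate = estimate_storage_gb(100)
    for component, size in estimate.items():
        print(f"{component}: {size:g} GB")
    print(f"Total: {sum(estimate.values()):g} GB")
```

For 100 GB of raw data this gives 70 GB of fixed overhead, 50 GB for DuckDB, and 200 GB for MinIO, or 320 GB in total.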

Network Requirements

| Requirement | Specification | Notes |
| --- | --- | --- |
| Bandwidth | 100 Mbps minimum | 1 Gbps for large file uploads |
| Latency | < 100 ms to AI providers | Affects natural language processing |
| Ports | See Port Requirements | Configurable |
| DNS | Required | For external service connections |
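
To see why 1 Gbps is suggested for large uploads, it helps to estimate the transfer time at each bandwidth. The sketch below is a best-case calculation that assumes decimal units (1 GB = 8000 megabits) and ignores protocol overhead and contention. The function name is illustrative.

```python
def transfer_seconds(size_gb: float, bandwidth_mbps: float) -> float:
    """Best-case time in seconds to move size_gb over a link of bandwidth_mbps."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    megabits = size_gb * 1000 * 8  # GB -> MB -> megabits
    return megabits / bandwidth_mbps


if __name__ == "__main__":
    for mbps in (100, 1000):
        seconds = transfer_seconds(10, mbps)
        print(f"10 GB at {mbps} Mbps: about {seconds / 60:.1f} minutes")
```

A 10 GB file needs at least 800 seconds (about 13 minutes) at 100 Mbps and 80 seconds at 1 Gbps.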

Software Requirements

Operating Systems

Linux (Recommended for Production)

Supported distributions:

  • Ubuntu 20.04 LTS, 22.04 LTS
  • Debian 11, 12
  • RHEL 8, 9
  • Amazon Linux 2, 2023
  • Alpine Linux 3.16+ (for containers)

Requirements:

  • Kernel 4.15+
  • systemd or Docker
  • glibc 2.17+
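
The kernel requirement can be checked on a host before installing. This is a small sketch that parses the release string reported by Python's `platform.release()` and compares it with 4.15. The helper names are hypothetical, and the sketch does not check systemd, Docker, or glibc.

```python
import platform
import re

MIN_KERNEL = (4, 15)  # minimum kernel version from the requirements list


def parse_kernel(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_ok(release: str, minimum: tuple[int, int] = MIN_KERNEL) -> bool:
    """Return True if the release is at least the minimum version."""
    return parse_kernel(release) >= minimum


if __name__ == "__main__":
    if platform.system() == "Linux":
        release = platform.release()
        status = "OK" if kernel_ok(release) else "older than 4.15"
        print(f"Kernel {release}: {status}")
    else:
        print("Not a Linux host; kernel check skipped")
```

Comparing versions as integer tuples avoids the string-comparison trap where "4.9" would sort after "4.15".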

macOS (Development and Testing)

Supported versions:

  • macOS 12 (Monterey) or later
  • Apple Silicon (M1/M2) fully supported
  • Intel Macs supported

Requirements:

  • Docker Desktop 4.0+
  • Xcode Command Line Tools

Windows (Development Only)

Supported versions:

  • Windows 10 version 2004+
  • Windows 11
  • Windows Server 2019+

Requirements:

  • WSL2 enabled
  • Docker Desktop 4.0+
  • Windows Terminal (recommended)

Cloud Platforms

Tested on:

  • AWS (EC2, ECS, EKS)
  • Google Cloud (GCE, GKE)
  • Azure (VMs, AKS)
  • DigitalOcean
  • Linode/Akamai

Any other platform that supports Docker or Kubernetes should also work.

Required Software

For Docker Deployment

| Software | Minimum Version | Recommended | Installation Guide |
| --- | --- | --- | --- |
| Docker | 20.10 | Latest stable | Install Docker |
| Docker Compose | 2.0 | 2.20+ | Install Compose |
| Git | 2.25 | Latest | Install Git |
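
Installed versions can be compared with these minimums by parsing the output of each tool's `--version` command. The following sketch shows one way to do that. The helper names are hypothetical, and it assumes the first dotted number in the output is the version, which holds for output such as `Docker version 24.0.7, build afdd53b` and `git version 2.34.1`.

```python
import re
import shutil
import subprocess

# Minimum versions from the Docker deployment table.
MINIMUMS = {"docker": "20.10", "git": "2.25"}


def parse_version(text: str):
    """Return the first dotted version in text as a tuple of ints, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_output: str, minimum: str) -> bool:
    """Compare a tool's version output against a minimum such as '20.10'."""
    found = parse_version(version_output)
    return found is not None and found >= parse_version(minimum)


def version_output(tool: str):
    """Run '<tool> --version' and return its output, or None if not installed."""
    if shutil.which(tool) is None:
        return None
    result = subprocess.run([tool, "--version"], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    for tool, minimum in MINIMUMS.items():
        output = version_output(tool)
        if output is None:
            print(f"{tool}: not found")
        else:
            status = "OK" if meets_minimum(output, minimum) else f"needs {minimum}+"
            print(f"{tool}: {output} ({status})")
```

Docker Compose v2 is invoked as `docker compose version` rather than with a `--version` flag on a separate binary, so it would need its own command; `meets_minimum` can still parse its output.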

For Kubernetes Deployment

| Software | Minimum Version | Recommended | Notes |
| --- | --- | --- | --- |
| Kubernetes | 1.24 | 1.28+ | Any certified distribution |
| kubectl | 1.24 | Match cluster | Install kubectl |
| Helm | 3.8 | 3.13+ | Install Helm |

For Local Development

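| Software | Minimum Version | Recommended | Purpose |
| --- | --- | --- | --- |
| Node.js | 18.0 | 20 LTS | Frontend development |
| Bun | 1.0 | Latest | Fast package manager |
| Go | 1.20 | 1.21+ | Backend development |
| Python | 3.10 | 3.11+ | AI service |
| Make | 3.82 | 4.0+ | Build automation |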

Database Requirements

GoPie uses multiple databases for different purposes:

| Database | Version | Purpose | Notes |
| --- | --- | --- | --- |
| PostgreSQL | 14+ | Metadata storage | 15+ recommended |
| DuckDB | 0.9+ | Analytical queries | Embedded, no separate install |
| Qdrant | 1.7+ | Vector search | 1.8+ recommended |

AI Provider Requirements

Supported Providers

OpenAI API

Requirements:

  • Valid API key
  • Supported models: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo
  • Account with sufficient credits
  • Rate limits: 10,000 TPM minimum recommended

Network:

  • Outbound HTTPS to api.openai.com
  • Stable internet connection
  • Low latency preferred (< 100 ms)

Anthropic Claude

Requirements:

  • Valid API key
  • Supported models: Claude 3 Opus, Sonnet, Haiku
  • Account with API access
  • Rate limits based on tier

Network:

  • Outbound HTTPS to api.anthropic.com
  • Stable internet connection

Azure OpenAI Service

Requirements:

  • Azure subscription
  • Azure OpenAI resource deployed
  • Deployment names for models
  • Sufficient quota allocated

Network:

  • Outbound HTTPS to your Azure endpoint
  • Azure Virtual Network (optional)

Self-Hosted Models

Requirements:

  • Ollama or compatible server
  • Sufficient GPU/CPU for model
  • 16+ GB RAM for 7B models
  • 32+ GB RAM for 13B models

Supported frameworks:

  • Ollama
  • LocalAI
  • Text Generation WebUI
  • Any OpenAI-compatible API

Cost Consideration: OpenAI GPT-4 provides the best results but costs more. GPT-3.5 Turbo offers good performance at lower cost. Local models eliminate API costs but require powerful hardware.

Port Requirements

Default ports used by GoPie services:

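| Service | Default Port | Protocol | Configurable | Purpose |
| --- | --- | --- | --- | --- |
| Web UI | 3000 | HTTP/WS | Yes | User interface |
| API Server | 8000 | HTTP | Yes | REST API |
| Chat Server | 8001 | HTTP | Yes | AI processing |
| PostgreSQL | 5432 | TCP | Yes | Database |
| Qdrant | 6333 | HTTP | Yes | Vector search |
| Qdrant gRPC | 6334 | gRPC | Yes | Internal communication |
| MinIO | 9000 | HTTP | Yes | Object storage |
| MinIO Console | 9001 | HTTP | Yes | Management UI |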
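
Before the first start, it can be useful to confirm that none of these default ports is already taken on the host. The sketch below attempts a TCP connection to each port on localhost and reports the ones where something is listening. The function name is illustrative, and a service bound only to a non-loopback interface would not be detected.

```python
import socket

# Default ports from the table above.
DEFAULT_PORTS = {
    "Web UI": 3000,
    "API Server": 8000,
    "Chat Server": 8001,
    "PostgreSQL": 5432,
    "Qdrant": 6333,
    "Qdrant gRPC": 6334,
    "MinIO": 9000,
    "MinIO Console": 9001,
}


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for service, port in DEFAULT_PORTS.items():
        state = "in use" if port_in_use(port) else "free"
        print(f"{service} ({port}): {state}")
```

Any port reported as in use before GoPie is started belongs to another process, so either stop that process or reconfigure the corresponding GoPie service to a different port.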

Performance Considerations

Dataset Size Guidelines

Performance characteristics by dataset size:

| Dataset Size | Query Performance | Import Time | Memory Usage |
| --- | --- | --- | --- |
| < 1 GB | < 1 second | < 1 minute | 2 GB |
| 1-10 GB | 1-5 seconds | 1-10 minutes | 4-8 GB |
| 10-100 GB | 5-30 seconds | 10-60 minutes | 8-32 GB |
| > 100 GB | Varies | Hours | 32+ GB |

Concurrent Users

Resource scaling for concurrent users:

| Users | CPU Cores | RAM | Considerations |
| --- | --- | --- | --- |
| 1-10 | 4 | 8 GB | Single instance sufficient |
| 10-50 | 8 | 16 GB | Consider read replicas |
| 50-200 | 16 | 32 GB | Load balancing recommended |
| 200+ | 32+ | 64+ GB | Horizontal scaling required |

Query Complexity

Resource usage by query type:

| Query Type | CPU Usage | Memory Usage | Example |
| --- | --- | --- | --- |
| Simple SELECT | Low | Low | "Show first 10 rows" |
| Aggregations | Medium | Medium | "Sum by category" |
| Complex JOINs | High | High | "Cross-dataset analysis" |
| ML Predictions | Very High | Very High | "Predict next month" |

Security Requirements

Network Security

  • Firewall: Allow only required ports
  • TLS/SSL: Required for production
  • VPN: Recommended for admin access
  • Network Isolation: Separate database tier

System Security

  • OS Updates: Regular security patches
  • Docker Security: Non-root containers
  • File Permissions: Restrictive access
  • Secrets Management: External vault recommended

Compliance Considerations

For regulated environments:

  • Data Encryption: At rest and in transit
  • Audit Logging: All data access logged
  • Access Controls: Role-based permissions
  • Data Residency: Deploy in compliant regions

Monitoring Requirements

Essential Monitoring

| Metric | Tool | Threshold |
| --- | --- | --- |
| CPU Usage | Prometheus | < 80% |
| Memory Usage | Prometheus | < 85% |
| Disk Space | Node Exporter | > 20% free |
| Response Time | APM | < 2 seconds |
| Error Rate | Sentry | < 1% |

  • Metrics: Prometheus + Grafana
  • Logs: ELK Stack or Loki
  • Traces: Jaeger or Tempo
  • Errors: Sentry
  • Uptime: Uptime Kuma
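
The thresholds above translate directly into a health check. The sketch below evaluates a set of metric readings against them. In a real deployment these rules would normally live in the monitoring tool's alerting configuration, so treat this as an illustration of the logic only. The metric keys and function names are hypothetical; only `shutil.disk_usage` reads a live value.

```python
import shutil

# (direction, limit): "below" means the value must stay under the limit,
# "above" means it must stay over it. Limits are from the monitoring table.
THRESHOLDS = {
    "cpu_percent": ("below", 80.0),
    "memory_percent": ("below", 85.0),
    "disk_free_percent": ("above", 20.0),
    "response_time_s": ("below", 2.0),
    "error_rate_percent": ("below", 1.0),
}


def disk_free_percent(path: str = "/") -> float:
    """Return the percentage of free space on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total * 100


def violations(metrics: dict) -> list:
    """Return the names of the metrics that are outside their threshold."""
    problems = []
    for name, value in metrics.items():
        direction, limit = THRESHOLDS[name]
        healthy = value < limit if direction == "below" else value > limit
        if not healthy:
            problems.append(name)
    return problems


if __name__ == "__main__":
    # CPU, memory, latency, and error values here are made-up sample readings.
    sample = {
        "cpu_percent": 72.0,
        "memory_percent": 91.0,
        "disk_free_percent": disk_free_percent("."),
        "response_time_s": 0.8,
        "error_rate_percent": 0.2,
    }
    print("Out of range:", violations(sample) or "none")
```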

Backup Requirements

Storage for Backups

| Component | Backup Size | Frequency | Retention |
| --- | --- | --- | --- |
| PostgreSQL | ~1% of data | Daily | 30 days |
| MinIO/S3 | 100% of data | Weekly | 90 days |
| Qdrant | ~5% of data | Daily | 7 days |
| Configuration | < 1 MB | On change | Forever |
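
The table can be used to estimate an upper bound on backup storage. The sketch below assumes every backup is a full, uncompressed copy and that all copies inside the retention window are kept. Incremental backups, compression, or deduplication would lower these numbers considerably. Configuration backups are negligible and omitted, and the names used are illustrative.

```python
import math

# component: (backup size as % of data, interval in days, retention in days)
BACKUP_POLICY = {
    "PostgreSQL": (1, 1, 30),
    "MinIO/S3": (100, 7, 90),
    "Qdrant": (5, 1, 7),
}


def backup_storage_gb(data_gb: float) -> dict:
    """Worst-case backup storage per component, assuming full copies are retained."""
    result = {}
    for component, (percent, interval_days, retention_days) in BACKUP_POLICY.items():
        copies = math.ceil(retention_days / interval_days)
        result[component] = data_gb * percent / 100 * copies
    return result


if __name__ == "__main__":
    estimate = backup_storage_gb(100)
    for component, size in estimate.items():
        print(f"{component}: {size:g} GB")
    print(f"Total: {sum(estimate.values()):g} GB")
```

For 100 GB of data this worst case is 30 GB for PostgreSQL, 1300 GB for MinIO/S3 (13 weekly copies), and 35 GB for Qdrant. The object-storage backups dominate, which is where incremental or deduplicated backups save the most space.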

Backup Storage Options

  • Local: 2x primary storage
  • Cloud: S3, GCS, Azure Blob
  • Network: NFS, SMB shares
  • Tape: For long-term archives

Scaling Guidelines

Vertical Scaling (Scale Up)

When to scale up:

  • CPU consistently > 80%
  • Memory usage > 85%
  • Query times increasing
  • Single-node simplicity preferred

Horizontal Scaling (Scale Out)

When to scale out:

  • Need high availability
  • Geographic distribution
  • 200+ concurrent users
  • Dataset > 1 TB
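
The two sets of criteria can be combined into a simple decision helper. This is a sketch under stated assumptions: the function and parameter names are hypothetical, 1 TB is treated as 1000 GB, the user criterion is read as 200 or more (matching the 200+ row in the concurrent users table), and scale-out criteria take precedence when both sets apply, since a single larger node cannot provide high availability or geographic distribution.

```python
def scaling_advice(
    cpu_percent: float,
    memory_percent: float,
    concurrent_users: int,
    dataset_gb: float,
    need_high_availability: bool = False,
    multi_region: bool = False,
) -> str:
    """Return 'scale out', 'scale up', or 'no change' based on the criteria above."""
    scale_out = (
        need_high_availability
        or multi_region
        or concurrent_users >= 200
        or dataset_gb > 1000
    )
    scale_up = cpu_percent > 80 or memory_percent > 85

    if scale_out:
        return "scale out"
    if scale_up:
        return "scale up"
    return "no change"


if __name__ == "__main__":
    print(scaling_advice(cpu_percent=90, memory_percent=60,
                         concurrent_users=40, dataset_gb=200))
```

The "query times increasing" criterion is left out because it needs a trend over time and cannot be judged from a single reading.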

Service-Specific Scaling

| Service | Scale Method | Limit | Notes |
| --- | --- | --- | --- |
| Web Frontend | Horizontal | Unlimited | Stateless |
| API Server | Horizontal | Unlimited | Stateless |
| Chat Server | Horizontal | Unlimited | Session affinity needed |
| PostgreSQL | Read replicas | 5 replicas | Write scaling limited |
| DuckDB | Vertical | Single node | Per-dataset isolation |
| Qdrant | Horizontal | 100 nodes | Distributed mode |

Cloud Provider Specifics

AWS Recommendations

  • EC2: m6i.2xlarge or larger
  • EBS: gp3 with 10,000 IOPS
  • RDS: db.r6g.xlarge for PostgreSQL
  • S3: For object storage
  • EKS: For Kubernetes deployment

Google Cloud

  • GCE: n2-standard-8 or larger
  • Persistent Disk: SSD with 30,000 IOPS
  • Cloud SQL: db-n1-standard-4
  • GCS: For object storage
  • GKE: For Kubernetes deployment

Azure

  • VMs: Standard_D8s_v5 or larger
  • Managed Disks: Premium SSD
  • Database: GP_Gen5_4 for PostgreSQL
  • Blob Storage: For objects
  • AKS: For Kubernetes deployment

Pre-Installation Checklist

Before installing GoPie, verify:

  • Operating system meets requirements
  • Sufficient CPU, RAM, and storage
  • Required software installed
  • Network connectivity verified
  • Firewall rules configured
  • AI provider credentials ready
  • Backup storage available
  • Monitoring solution planned
  • Security requirements understood
  • Scaling strategy defined

Ready to Install? Once you've verified these requirements, proceed to our Quick Start Guide or choose your preferred installation method.