System Requirements

This guide outlines the hardware, software, and network requirements for running GoPie in various environments, from local development to large-scale production deployments.

Quick Reference

Single Developer / Testing

CPU: 4 cores
RAM: 8 GB
Storage: 20 GB SSD
Users: 1-5
Datasets: < 10 GB

Small Team / Department

CPU: 8 cores
RAM: 16 GB
Storage: 100 GB SSD
Users: 5-50
Datasets: < 100 GB

Standard Production

CPU: 16 cores
RAM: 32 GB
Storage: 500 GB SSD
Users: 50-500
Datasets: < 1 TB

Large Enterprise

CPU: 32+ cores
RAM: 64+ GB
Storage: 2+ TB SSD
Users: 500+
Datasets: Multiple TB

Hardware Requirements

CPU Requirements

GoPie's performance scales with CPU cores, especially for:

Concurrent query processing
AI model inference
Data transformation operations

Component	Minimum	Recommended	Notes
Web Frontend	1 core	2 cores	Node.js based, lightweight
Go Backend	2 cores	4 cores	Handles API requests, file processing
AI Service	2 cores	4 cores	CPU intensive for model inference
PostgreSQL	1 core	2 cores	Metadata operations
DuckDB	2 cores	4+ cores	Benefits from parallelization
Qdrant	1 core	2 cores	Vector search operations

Performance Tip: DuckDB can utilize all available CPU cores for analytical queries. More cores = faster query execution.

Memory Requirements

Memory allocation recommendations by service:

Component	Minimum	Recommended	Large Scale
Web Frontend	512 MB	1 GB	2 GB
Go Backend	1 GB	2 GB	4 GB
AI Service	2 GB	4 GB	8 GB
PostgreSQL	1 GB	2 GB	4 GB
DuckDB	2 GB	4 GB	16+ GB
Qdrant	1 GB	2 GB	4 GB
MinIO	512 MB	1 GB	2 GB

Total System Memory:

Minimum: 8 GB
Recommended: 16 GB
Large Scale: 32+ GB

Storage Requirements

Storage needs depend on your data volume:

Type	Minimum	Recommended	Considerations
System & Apps	10 GB	20 GB	Docker images, application code
PostgreSQL	5 GB	20 GB	Grows with metadata
DuckDB Files	10 GB	50 GB	~50% of raw data size
MinIO Storage	Variable	2x data size	Original files + processed
Qdrant Vectors	5 GB	20 GB	Depends on dataset count
Logs & Temp	5 GB	10 GB	Rotation recommended

Storage Type Matters: Use SSD storage for optimal performance. HDD storage will significantly impact query performance and file upload speeds.

Network Requirements

Requirement	Specification	Notes
Bandwidth	100 Mbps minimum	1 Gbps for large file uploads
Latency	< 100ms to AI providers	Affects natural language processing
Ports	See Port Requirements	Configurable
DNS	Required	For external service connections

Software Requirements

Operating Systems

Recommended for Production

Supported distributions:

Ubuntu 20.04 LTS, 22.04 LTS
Debian 11, 12
RHEL 8, 9
Amazon Linux 2, 2023
Alpine Linux 3.16+ (for containers)

Requirements:

Kernel 4.15+
systemd or Docker
glibc 2.17+

Development and Testing

Supported versions:

macOS 12 (Monterey) or later
Apple Silicon (M1/M2) fully supported
Intel Macs supported

Requirements:

Docker Desktop 4.0+
Xcode Command Line Tools

Development Only

Supported versions:

Windows 10 version 2004+
Windows 11
Windows Server 2019+

Requirements:

WSL2 enabled
Docker Desktop 4.0+
Windows Terminal (recommended)

Cloud Platforms

Tested on:

AWS (EC2, ECS, EKS)
Google Cloud (GCE, GKE)
Azure (VMs, AKS)
DigitalOcean
Linode/Akamai

Any platform supporting Docker/Kubernetes

Required Software

For Docker Deployment

Software	Minimum Version	Recommended	Installation Guide
Docker	20.10	Latest stable	Install Docker
Docker Compose	2.0	2.20+	Install Compose
Git	2.25	Latest	Install Git

For Kubernetes Deployment

Software	Minimum Version	Recommended	Notes
Kubernetes	1.24	1.28+	Any certified distribution
kubectl	1.24	Match cluster	Install kubectl
Helm	3.8	3.13+	Install Helm

For Local Development

Software	Minimum Version	Recommended	Purpose
Node.js	18.0	20 LTS	Frontend development
Bun	1.0	Latest	Fast package manager
Go	1.20	1.21+	Backend development
Python	3.10	3.11+	AI service
Make	3.82	4.0+	Build automation

Database Requirements

GoPie uses multiple databases for different purposes:

Database	Version	Purpose	Notes
PostgreSQL	14+	Metadata storage	15+ recommended
DuckDB	0.9+	Analytical queries	Embedded, no separate install
Qdrant	1.7+	Vector search	1.8+ recommended

AI Provider Requirements

Supported Providers

OpenAI API

Requirements:

Valid API key
Supported models: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo
Account with sufficient credits
Rate limits: 10,000 TPM minimum recommended

Network:

Outbound HTTPS to api.openai.com
Stable internet connection
Low latency preferred (< 100ms)

Anthropic Claude

Requirements:

Valid API key
Supported models: Claude 3 Opus, Sonnet, Haiku
Account with API access
Rate limits based on tier

Network:

Outbound HTTPS to api.anthropic.com
Stable internet connection

Azure OpenAI Service

Requirements:

Azure subscription
Azure OpenAI resource deployed
Deployment names for models
Sufficient quota allocated

Network:

Outbound HTTPS to your Azure endpoint
Azure Virtual Network (optional)

Self-Hosted Models

Requirements:

Ollama or compatible server
Sufficient GPU/CPU for model
16GB+ RAM for 7B models
32GB+ RAM for 13B models

Supported frameworks:

Ollama
LocalAI
Text Generation WebUI
Any OpenAI-compatible API

Cost Consideration: OpenAI GPT-4 provides the best results but costs more. GPT-3.5 Turbo offers good performance at lower cost. Local models eliminate API costs but require powerful hardware.

Port Requirements

Default ports used by GoPie services:

Service	Default Port	Protocol	Configurable	Purpose
Web UI	3000	HTTP/WS	Yes	User interface
API Server	8000	HTTP	Yes	REST API
Chat Server	8001	HTTP	Yes	AI processing
PostgreSQL	5432	TCP	Yes	Database
Qdrant	6333	HTTP	Yes	Vector search
Qdrant gRPC	6334	gRPC	Yes	Internal communication
MinIO	9000	HTTP	Yes	Object storage
MinIO Console	9001	HTTP	Yes	Management UI

Performance Considerations

Dataset Size Guidelines

Performance characteristics by dataset size:

Dataset Size	Query Performance	Import Time	Memory Usage
< 1 GB	< 1 second	< 1 minute	2 GB
1-10 GB	1-5 seconds	1-10 minutes	4-8 GB
10-100 GB	5-30 seconds	10-60 minutes	8-32 GB
> 100 GB	Varies	Hours	32+ GB

Concurrent Users

Resource scaling for concurrent users:

Users	CPU Cores	RAM	Considerations
1-10	4	8 GB	Single instance sufficient
10-50	8	16 GB	Consider read replicas
50-200	16	32 GB	Load balancing recommended
200+	32+	64+ GB	Horizontal scaling required

Query Complexity

Resource usage by query type:

Query Type	CPU Usage	Memory Usage	Example
Simple SELECT	Low	Low	"Show first 10 rows"
Aggregations	Medium	Medium	"Sum by category"
Complex JOINs	High	High	"Cross-dataset analysis"
ML Predictions	Very High	Very High	"Predict next month"

Security Requirements

Network Security

Firewall: Allow only required ports
TLS/SSL: Required for production
VPN: Recommended for admin access
Network Isolation: Separate database tier

System Security

OS Updates: Regular security patches
Docker Security: Non-root containers
File Permissions: Restrictive access
Secrets Management: External vault recommended

Compliance Considerations

For regulated environments:

Data Encryption: At rest and in transit
Audit Logging: All data access logged
Access Controls: Role-based permissions
Data Residency: Deploy in compliant regions

Monitoring Requirements

Essential Monitoring

Metric	Tool	Threshold
CPU Usage	Prometheus	< 80%
Memory Usage	Prometheus	< 85%
Disk Space	Node Exporter	> 20% free
Response Time	APM	< 2 seconds
Error Rate	Sentry	< 1%

Recommended Stack

Metrics: Prometheus + Grafana
Logs: ELK Stack or Loki
Traces: Jaeger or Tempo
Errors: Sentry
Uptime: Uptime Kuma

Backup Requirements

Storage for Backups

Component	Backup Size	Frequency	Retention
PostgreSQL	~1% of data	Daily	30 days
MinIO/S3	100% of data	Weekly	90 days
Qdrant	~5% of data	Daily	7 days
Configuration	< 1 MB	On change	Forever

Backup Storage Options

Local: 2x primary storage
Cloud: S3, GCS, Azure Blob
Network: NFS, SMB shares
Tape: For long-term archives

Scaling Guidelines

Vertical Scaling (Scale Up)

When to scale up:

CPU consistently > 80%
Memory usage > 85%
Query times increasing
Single-node simplicity preferred

Horizontal Scaling (Scale Out)

When to scale out:

Need high availability
Geographic distribution
200 concurrent users
Dataset > 1TB

Service-Specific Scaling

Service	Scale Method	Limit	Notes
Web Frontend	Horizontal	Unlimited	Stateless
API Server	Horizontal	Unlimited	Stateless
Chat Server	Horizontal	Unlimited	Session affinity needed
PostgreSQL	Read replicas	5 replicas	Write scaling limited
DuckDB	Vertical	Single node	Per-dataset isolation
Qdrant	Horizontal	100 nodes	Distributed mode

Cloud Provider Specifics

AWS Recommendations

EC2: m6i.2xlarge or larger
EBS: gp3 with 10,000 IOPS
RDS: db.r6g.xlarge for PostgreSQL
S3: For object storage
EKS: For Kubernetes deployment

Google Cloud

GCE: n2-standard-8 or larger
Persistent Disk: SSD with 30,000 IOPS
Cloud SQL: db-n1-standard-4
GCS: For object storage
GKE: For Kubernetes deployment

Azure

VMs: Standard_D8s_v5 or larger
Managed Disks: Premium SSD
Database: GP_Gen5_4 for PostgreSQL
Blob Storage: For objects
AKS: For Kubernetes deployment

Pre-Installation Checklist

Before installing GoPie, verify:

Ready to Install? Once you've verified these requirements, proceed to our Quick Start Guide or choose your preferred installation method.

Quick Reference

Hardware Requirements

CPU Requirements

Memory Requirements

Storage Requirements

Network Requirements

Software Requirements

Operating Systems

Required Software

For Docker Deployment

For Kubernetes Deployment

For Local Development

Database Requirements

AI Provider Requirements

Supported Providers

Port Requirements

Performance Considerations

Dataset Size Guidelines

Concurrent Users

Query Complexity

Security Requirements

Network Security

System Security

Compliance Considerations

Monitoring Requirements

Essential Monitoring

Recommended Stack

Backup Requirements

Storage for Backups

Backup Storage Options

Scaling Guidelines

Vertical Scaling (Scale Up)

Horizontal Scaling (Scale Out)

Service-Specific Scaling

Cloud Provider Specifics

AWS Recommendations

Google Cloud

Azure

Pre-Installation Checklist

System Requirements

On this page