Uploading Data
Learn how to upload and import data files into GoPie
GoPie lets you upload data files and start analyzing them immediately. This guide covers supported formats, upload methods, file size limits, validation, and troubleshooting.
Supported File Formats
GoPie supports the most common data formats:
CSV (Comma-Separated Values)
- Most common format for tabular data
- Supports custom delimiters (comma, tab, pipe, etc.)
- Automatic encoding detection (UTF-8, Latin-1, etc.)
- Header row detection
name,age,department,salary
John Doe,32,Engineering,95000
Jane Smith,28,Marketing,75000
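Before uploading, it can help to check locally which delimiter and header row a parser is likely to detect. The following is a minimal sketch using Python's standard `csv.Sniffer`; it is not GoPie's own detection logic, and the helper name `inspect_csv` is hypothetical.

```python
import csv

# The sample CSV from this guide.
SAMPLE = (
    "name,age,department,salary\n"
    "John Doe,32,Engineering,95000\n"
    "Jane Smith,28,Marketing,75000\n"
)


def inspect_csv(sample: str, candidates: str = ",\t|;") -> tuple[str, bool]:
    """Guess the delimiter and whether the first row looks like a header.

    `candidates` restricts the guess to common delimiters so that ordinary
    letters are never mistaken for a separator.
    """
    sniffer = csv.Sniffer()
    dialect = sniffer.sniff(sample, delimiters=candidates)
    return dialect.delimiter, sniffer.has_header(sample)


delimiter, has_header = inspect_csv(SAMPLE)
print(repr(delimiter), has_header)
```

`csv.Sniffer` is heuristic, so treat its answer as a hint and confirm the result in the upload preview.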
Excel Files (.xlsx, .xls)
- Supports multiple sheets
- Preserves data types and formatting
- Handles merged cells and formulas
- Date/time format recognition
When uploading Excel files with multiple sheets, each sheet becomes a separate table in your dataset.
Parquet Files
- Columnar storage format
- Excellent compression
- Preserves complex data types
- Ideal for large datasets
Best for:
- Large analytical datasets
- Data with many columns
- Performance-critical applications
JSON Files
- Supports nested structures
- Array of objects format
- Automatic flattening options
- Schema inference
[
  {"id": 1, "name": "Product A", "price": 29.99},
  {"id": 2, "name": "Product B", "price": 39.99}
]
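To illustrate what flattening a nested structure means, here is a small sketch that turns nested objects into dotted column names. The dotted naming scheme and the `dimensions` field are assumptions made for illustration; GoPie's actual flattening options may name columns differently.

```python
from typing import Any


def flatten(record: dict[str, Any], parent: str = "", sep: str = ".") -> dict[str, Any]:
    """Flatten nested dicts into a single level, joining keys with `sep`.

    Lists and scalar values are kept as-is; only nested objects are expanded.
    """
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical record: "dimensions" is not part of the sample above.
product = {
    "id": 1,
    "name": "Product A",
    "price": 29.99,
    "dimensions": {"width": 10, "height": 4},
}
print(flatten(product))
```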
Upload Methods
Drag and Drop Upload
The easiest way to upload files:
- Navigate to Datasets - Click "Datasets" in the main navigation or within your project.
- Drag Your File - Drag your file directly onto the upload area. You'll see a visual indicator when hovering.
- Confirm Upload - Review the file details and click "Upload" to proceed.
Click to Browse
Alternatively, click the upload area to open your file browser:
- Click "Upload Dataset" or the upload area
- Select one or more files from your computer
- Review and confirm the upload
Batch Upload
To upload multiple files at once, select them together in your file browser:
- Windows/Linux: Ctrl+Click
- Mac: Cmd+Click
- All platforms: Shift+Click for ranges
When uploading multiple files:
- Each file becomes a separate table in your dataset
- Files are processed in parallel for faster uploads
- Progress is shown for each file individually
File Size Limits and Recommendations
Size Limits
- Free Plan: Up to 100MB per file
- Pro Plan: Up to 1GB per file
- Enterprise Plan: Up to 10GB per file (contact support for larger files)
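A quick local check can tell you whether a file fits your plan before you start an upload. This sketch hard-codes the limits listed above and assumes decimal units (1 MB = 1,000,000 bytes, 1 GB = 1,000 MB); the plan keys and function names are hypothetical, not part of any GoPie API.

```python
import os
from typing import Optional

MB = 1000 ** 2  # assumption: decimal megabytes

# Per-file limits taken from the list above.
PLAN_LIMITS = {
    "free": 100 * MB,
    "pro": 1000 * MB,
    "enterprise": 10_000 * MB,
}


def fits_plan(size_bytes: int, plan: str) -> bool:
    """Return True if a file of this size is within the plan's per-file limit."""
    return size_bytes <= PLAN_LIMITS[plan]


def smallest_plan(size_bytes: int) -> Optional[str]:
    """Return the lowest plan that accepts the file, or None if none does."""
    for plan in ("free", "pro", "enterprise"):
        if fits_plan(size_bytes, plan):
            return plan
    return None


def check_file(path: str, plan: str) -> bool:
    """Check a file on disk against a plan limit."""
    return fits_plan(os.path.getsize(path), plan)


print(smallest_plan(250 * MB))
```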
Performance Recommendations
| File Size | Format Recommendation | Approx. Upload Time |
|---|---|---|
| < 10MB | Any format | < 5 seconds |
| 10MB - 100MB | CSV or Parquet | 5-30 seconds |
| 100MB - 1GB | Parquet recommended | 30 seconds - 2 minutes |
| > 1GB | Parquet required | 2-10 minutes |
Large File Tips
For files over 100MB:
- Use Parquet Format - Files are typically 5-10x smaller than the equivalent CSV
- Split Large Files - Break into multiple smaller files
- Remove Unnecessary Columns - Upload only needed data
- Use Data Sources - Consider connecting directly to your database
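Splitting a large CSV can be done with the standard library alone. The sketch below reads a CSV stream and yields smaller CSV chunks, repeating the header in each so that every part can be uploaded as its own table. The function names and the `rows_per_part` parameter are illustrative.

```python
import csv
import io
from typing import Iterator, List, TextIO


def _render(header: List[str], rows: List[List[str]]) -> str:
    """Serialize a header plus rows back into CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


def split_csv(source: TextIO, rows_per_part: int) -> Iterator[str]:
    """Yield CSV text chunks with at most `rows_per_part` data rows each."""
    reader = csv.reader(source)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    buffer: List[List[str]] = []
    for row in reader:
        buffer.append(row)
        if len(buffer) == rows_per_part:
            yield _render(header, buffer)
            buffer = []
    if buffer:  # final, partially filled chunk
        yield _render(header, buffer)


data = io.StringIO("id,value\n1,a\n2,b\n3,c\n")
parts = list(split_csv(data, rows_per_part=2))
print(len(parts))
```

For a real file, pass `open(path, newline="", encoding="utf-8")` as `source` and write each chunk to its own file.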
Data Validation
GoPie automatically validates your data during upload:
Automatic Checks
- File Integrity - Ensures the file is not corrupted
- Format Validation - Confirms the file content matches its expected format
- Encoding Detection - Handles various text encodings
- Schema Inference - Detects column types automatically
Common Validation Issues
Mixed Data Types
If a column contains mixed types (e.g., numbers and text), GoPie will:
- Attempt to find the most appropriate type
- Convert values where possible
- Mark unconvertible values as null
Example: A column with ["100", "200", "N/A"] becomes [100, 200, null]
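The conversion in this example can be reproduced locally to preview which values would end up as null. This is a sketch of the general technique, not GoPie's implementation; `coerce_numeric` is a hypothetical helper, and Python's `None` stands in for null.

```python
from typing import Any, List, Optional, Union

Number = Union[int, float]


def coerce_numeric(values: List[Any]) -> List[Optional[Number]]:
    """Convert each value to a number where possible, otherwise to None."""
    result: List[Optional[Number]] = []
    for raw in values:
        try:
            number = float(raw)
        except (TypeError, ValueError):
            result.append(None)  # unconvertible value becomes null
            continue
        # Keep whole numbers as ints so "100" becomes 100 rather than 100.0.
        result.append(int(number) if number.is_integer() else number)
    return result


print(coerce_numeric(["100", "200", "N/A"]))
```

Counting the `None` entries before uploading shows how much data a type conversion would discard.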
Handling Validation Errors
When validation fails, you'll see:
- Error Description - What went wrong
- Affected Rows - Which rows have issues
- Suggested Fixes - How to resolve the problem
Common fixes:
- Remove or fix corrupted rows
- Ensure consistent date formats
- Check for proper CSV delimiters
- Verify file encoding (save as UTF-8)
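If you are unsure of a file's encoding, one way to "save as UTF-8" outside a spreadsheet application is to re-encode the bytes yourself. The sketch below assumes the file is either already UTF-8 or in a single known fallback encoding (Latin-1 here); `to_utf8` is a hypothetical helper.

```python
def to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Return the content re-encoded as UTF-8.

    Valid UTF-8 input is returned unchanged in content; anything else is
    decoded with `fallback`, which is an assumption about the source file.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = raw.decode(fallback)
    return text.encode("utf-8")


legacy = "café".encode("latin-1")
print(to_utf8(legacy).decode("utf-8"))
```

Latin-1 can decode any byte sequence, so a wrong fallback produces garbled characters rather than an error. Check a few international characters after converting.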
Upload Progress and Status
During Upload
You'll see real-time progress including:
- Upload percentage
- Transfer speed
- Estimated time remaining
- Current processing stage
Processing Stages
- Upload - File transfer to GoPie servers
- Validation - Format checking and data validation
- Processing - Schema inference and optimization
- Indexing - Creating search indexes for fast queries
- Ready - Dataset available for querying
Advanced Upload Options
Custom Parsing Options
For CSV files, you can customize:
- Delimiter - Comma, tab, pipe, or custom
- Quote Character - Single, double, or none
- Escape Character - Backslash or custom
- Header Row - First row or specify row number
- Skip Rows - Ignore initial rows
- Encoding - UTF-8, Latin-1, etc.
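To see how these options interact, you can try them locally with Python's `csv` module before setting them in the upload dialog. This is a sketch only: the parameter names mirror the options above but are not GoPie settings, and the sample text is made up.

```python
import csv
import io
from typing import Dict, List, Optional


def parse_csv(
    text: str,
    delimiter: str = ",",
    quotechar: Optional[str] = '"',
    escapechar: Optional[str] = None,
    skip_rows: int = 0,
) -> List[Dict[str, str]]:
    """Parse CSV text with custom options; the first row kept is the header."""
    lines = io.StringIO(text)
    for _ in range(skip_rows):  # ignore leading non-data lines
        next(lines, None)
    reader = csv.DictReader(
        lines,
        delimiter=delimiter,
        quotechar=quotechar,
        escapechar=escapechar,
        # A quote character of None means "no quoting at all".
        quoting=csv.QUOTE_NONE if quotechar is None else csv.QUOTE_MINIMAL,
    )
    return list(reader)


# Hypothetical export: one comment line, pipe-delimited, single-quoted values.
raw = "# exported by a hypothetical tool\nname|dept\n'Doe, John'|Engineering\n"
rows = parse_csv(raw, delimiter="|", quotechar="'", skip_rows=1)
print(rows)
```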
Data Type Overrides
Override automatic type detection:
-- After upload, you can modify column types
ALTER TABLE my_dataset
ALTER COLUMN price TYPE DECIMAL(10,2);
Compression Support
GoPie automatically handles compressed files:
- .gz - Gzip compression
- .zip - ZIP archives (single file)
- .bz2 - Bzip2 compression
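Compressing a text file before upload reduces transfer time. As a minimal sketch, Python's standard `gzip` module can produce `.gz` content; the helper name `gzip_csv` and the sample payload are illustrative.

```python
import gzip


def gzip_csv(text: str) -> bytes:
    """Compress CSV text into gzip bytes suitable for a .csv.gz file."""
    return gzip.compress(text.encode("utf-8"))


# Repetitive tabular data usually compresses well.
payload = "id,value\n" + "1,a\n" * 1000
compressed = gzip_csv(payload)
print(len(payload.encode("utf-8")), len(compressed))
```

To write the result to disk, open the target with `open("data.csv.gz", "wb")` and write the returned bytes.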
Post-Upload Actions
After successful upload:
- Review Schema - Check detected column types
- Add Descriptions - Document your columns
- Set Aliases - Create user-friendly names
- Configure Relationships - Link to other datasets
- Test Queries - Run sample queries to verify
Troubleshooting
Upload Fails Immediately
- Check file size limits
- Ensure file extension matches content
- Verify you have upload permissions
- Try a different browser
Upload Stalls
- Check internet connection
- Try a smaller file, or split the large file into parts
- Use Parquet format for better compression
- Contact support for large uploads
Data Looks Wrong
- Verify CSV delimiter settings
- Check date/time formats
- Review encoding (especially for international characters)
- Ensure numeric formats use proper decimal separators
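Numbers exported with a comma as the decimal separator (for example `1.234,56`) are a common cause of columns being read as text. The sketch below normalizes such values before upload; it assumes you know which separators the source uses, and `parse_decimal` is a hypothetical helper.

```python
def parse_decimal(text: str, decimal_sep: str = ",", thousands_sep: str = ".") -> float:
    """Parse a number written with the given decimal and thousands separators."""
    # Drop thousands separators first, then switch the decimal mark to ".".
    cleaned = text.strip().replace(thousands_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


print(parse_decimal("1.234,56"))
```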
Best Practices
Prepare Your Data
- Remove unnecessary columns before upload
- Ensure consistent formatting
- Use meaningful column names
Choose the Right Format
- CSV for simplicity and compatibility
- Parquet for large files and performance
- Excel when preserving formatting matters
Document Your Data
- Add dataset descriptions immediately after upload
- Document any data transformations
- Note data sources and update frequency
Optimize for Analysis
- Include proper date columns for time-series analysis
- Use consistent units (e.g., all prices in USD)
- Maintain referential integrity for joins
What's Next?
Now that your data is uploaded, explore these topics:
- Connecting External Data Sources - Link to live databases
- Dataset Configuration - Customize your schema
- Natural Language Queries - Start asking questions