Data Cleaning
Clean and prepare your data for analysis with GoPie's built-in data cleaning capabilities.
Overview
Data cleaning is essential for ensuring accurate analysis. GoPie provides several tools to help you identify and fix common data quality issues.
Common Data Issues
Missing Values
- Identify columns with null values
- Choose appropriate handling strategies (remove, fill, interpolate)
- Set default values for specific columns
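The three handling strategies above can be illustrated outside GoPie with a few lines of plain Python. This is a minimal sketch, not GoPie's implementation: the sample rows and the helper names (`null_counts`, `drop_missing`, `fill_missing`, `interpolate`) are hypothetical and exist only to show what each strategy does to the data.

```python
# Hypothetical sample data; None marks a missing value.
rows = [
    {"city": "Pune", "temp": 31.0},
    {"city": None, "temp": None},
    {"city": "Delhi", "temp": 35.0},
]

def null_counts(rows):
    """Count missing (None) values per column."""
    counts = {}
    for row in rows:
        for col, value in row.items():
            counts[col] = counts.get(col, 0) + (value is None)
    return counts

def drop_missing(rows, col):
    """Strategy 1: remove rows where the column is missing."""
    return [r for r in rows if r[col] is not None]

def fill_missing(rows, col, default):
    """Strategy 2: replace missing values with a per-column default."""
    return [{**r, col: default if r[col] is None else r[col]} for r in rows]

def interpolate(values):
    """Strategy 3: linear interpolation between known neighbours.

    Leading and trailing gaps have no neighbour on one side and stay None.
    """
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            out[i] = out[a] + (out[b] - out[a]) * (i - a) / (b - a)
    return out

print(null_counts(rows))
print(fill_missing(rows, "city", "Unknown"))
print(interpolate([r["temp"] for r in rows]))
```

Removal is the safest choice when few rows are affected, a default suits categorical columns, and interpolation only makes sense for ordered numeric data such as time series.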
Data Type Mismatches
- Automatic type detection
- Manual type conversion options
- Handling mixed-type columns
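To make the idea of type detection concrete, here is a rough sketch of how a column of raw strings could be classified and how a mixed-type column could be flagged. It assumes nothing about GoPie's actual detection rules; `infer_type` and `column_types` are hypothetical helpers, and the order of the parsers (integer, then float, then ISO date) is an illustrative choice.

```python
from datetime import date

def infer_type(value):
    """Return the name of the first type that can parse the string."""
    text = value.strip()
    if text == "":
        return "null"
    parsers = (("integer", int), ("float", float), ("date", date.fromisoformat))
    for name, parser in parsers:
        try:
            parser(text)
            return name
        except ValueError:
            continue
    return "text"

def column_types(values):
    """Set of non-null types seen in a column; more than one means mixed."""
    return {infer_type(v) for v in values} - {"null"}

print(column_types(["1", "2", " 3 "]))
print(column_types(["1", "n/a", "2.5", ""]))
```

A column that reports more than one type is a candidate for manual conversion, for example by mapping placeholders such as `n/a` to null before converting the rest to numbers.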
Duplicate Records
- Detect duplicate rows
- Remove exact duplicates
- Handle partial duplicates with custom logic
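The difference between exact and partial duplicates can be sketched as follows. This is an illustration in plain Python, not GoPie's deduplication logic: the `customers` data and both helper functions are hypothetical, and the "custom logic" for partial duplicates is assumed here to be a comparison on normalized key columns.

```python
# Hypothetical sample data.
customers = [
    {"email": "a@x.com", "name": "Ann"},
    {"email": "a@x.com", "name": "Ann"},        # exact duplicate
    {"email": " A@X.com ", "name": "Ann L."},   # partial duplicate (same email)
    {"email": "b@x.com", "name": "Bob"},
]

def remove_exact_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def remove_partial_duplicates(rows, key_columns,
                              normalize=lambda v: str(v).strip().lower()):
    """Treat rows as duplicates when their normalized key columns match."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(normalize(row[c]) for c in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

print(len(remove_exact_duplicates(customers)))
print(remove_partial_duplicates(customers, ["email"]))
```

Keeping the first occurrence is only one possible policy; keeping the most recent or the most complete row are common alternatives.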
Cleaning Operations
Basic Operations
- Remove empty rows and columns
- Trim whitespace from text fields
- Standardize date formats
- Convert units and currencies
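As a rough illustration of the first three basic operations, the sketch below trims text fields, drops empty rows, and rewrites dates in ISO 8601 form. It is not GoPie's code: the helper names are hypothetical, and the list of accepted input formats in `DATE_FORMATS` is an assumption (note that `05/01/2024` is read day-first here).

```python
from datetime import datetime

# Assumed input formats, tried in order. Day-first is an arbitrary choice.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y.%m.%d")

def trim(row):
    """Strip leading and trailing whitespace from every text field."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}

def is_empty(row):
    """A row is empty when every field is None or an empty string."""
    return all(v is None or v == "" for v in row.values())

def to_iso_date(text):
    """Return the date as YYYY-MM-DD, or the input unchanged if unrecognized."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return text

raw = [
    {"name": "  Ann ", "joined": "05/01/2024"},
    {"name": "   ", "joined": ""},
    {"name": "Bob", "joined": "2024.02.10"},
]

cleaned = [r for r in map(trim, raw) if not is_empty(r)]
cleaned = [{**r, "joined": to_iso_date(r["joined"])} for r in cleaned]
print(cleaned)
```

Trimming before the empty-row check matters: a row containing only spaces is treated as empty only after the whitespace has been removed.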
Advanced Operations
- Regular expression transformations
- Custom validation rules
- Data normalization
- Outlier detection and handling
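Each of the advanced operations corresponds to a well-known technique, sketched below in plain Python. None of this is taken from GoPie: the function names are hypothetical, the email pattern is deliberately loose, normalization is assumed to mean min-max scaling, and outliers are assumed to be flagged with the common 1.5 × IQR rule.

```python
import re
from statistics import quantiles

def digits_only(text):
    """Regex transformation: strip everything except digits."""
    return re.sub(r"\D", "", text)

# Validation rule: a loose "something@something.something" check.
EMAIL_RULE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def is_valid_email(text):
    return EMAIL_RULE.fullmatch(text) is not None

def min_max(values):
    """Normalization: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def iqr_outliers(values, k=1.5):
    """Outlier detection: values beyond k * IQR from the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return [v for v in values if v < q1 - k * spread or v > q3 + k * spread]

print(digits_only("(555) 123-4567"))
print(is_valid_email("ann@example.com"), is_valid_email("ann@example"))
print(min_max([10, 20, 30]))
print(iqr_outliers([10, 12, 11, 13, 12, 95]))
```

How a flagged outlier is handled (removed, capped, or kept and annotated) depends on whether it is a data entry error or a genuine extreme value.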
Using the Data Cleaning Interface
- Select your dataset from the dashboard
- Navigate to the "Data Cleaning" tab
- Review the automated data quality report
- Apply cleaning operations as needed
- Preview the changes before applying them
- Save the cleaned dataset as a new version
Best Practices
- Always keep a copy of your original data
- Document all cleaning operations
- Validate results after each operation
- Use version control for dataset changes
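The first three practices can be combined in a small pattern: work on a copy, record every step, and check the result after each one. The sketch below is a generic illustration rather than a GoPie feature; `apply_steps` and the log format are hypothetical.

```python
import copy

def apply_steps(original, steps):
    """Apply (description, function) steps to a copy and log row counts."""
    data = copy.deepcopy(original)  # the original is never modified
    log = []
    for description, step in steps:
        before = len(data)
        data = step(data)
        log.append({"step": description,
                    "rows_before": before,
                    "rows_after": len(data)})
    return data, log

original = [{"id": 1}, {"id": None}, {"id": 1}]

steps = [
    ("drop rows without id", lambda rows: [r for r in rows if r["id"] is not None]),
    ("drop repeated ids", lambda rows: list({r["id"]: r for r in rows}.values())),
]

cleaned, log = apply_steps(original, steps)
for entry in log:
    print(entry)
```

The log doubles as documentation of what was done, and an unexpected jump in `rows_after` is an early signal that a step removed more than intended.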
Next Steps
- Learn about Dataset Configuration for advanced settings
- Explore Natural Language Queries to analyze your cleaned data