Data Cleaning

Clean and prepare your data for analysis with GoPie's built-in data cleaning capabilities.

Overview

Accurate analysis depends on clean data. GoPie provides tools to help you identify and fix common data quality issues such as missing values, data type mismatches, and duplicate records.

Common Data Issues

Missing Values

  • Identify columns with null values
  • Choose appropriate handling strategies (remove, fill, interpolate)
  • Set default values for specific columns
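
The three handling strategies named above (remove, fill, interpolate) can be illustrated outside GoPie with a minimal Python sketch. This is not GoPie's implementation: the function names and the row format (a list of dictionaries with None marking a missing value) are assumptions made for the example.

```python
def null_counts(rows):
    """Count missing (None) values per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


def drop_missing(rows, column):
    """Strategy 1: remove rows where the column is missing."""
    return [row for row in rows if row.get(column) is not None]


def fill_missing(rows, column, default):
    """Strategy 2: fill missing values with a default for that column."""
    return [
        {**row, column: default if row.get(column) is None else row[column]}
        for row in rows
    ]


def interpolate(values):
    """Strategy 3: linearly interpolate gaps between known numeric values.

    Leading and trailing gaps have no neighbour on one side and stay None.
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result
```

Removing rows is the simplest option but discards data. Filling suits categorical columns with a sensible default, and interpolation only makes sense for ordered numeric series such as time series.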

Data Type Mismatches

  • Automatic type detection
  • Manual type conversion options
  • Handling mixed-type columns
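
To make the idea of type detection and mixed-type columns concrete, here is a hedged Python sketch. The detection order (integer, float, ISO date, text) and the helper names are assumptions for illustration; GoPie's own detection rules may differ.

```python
from collections import Counter
from datetime import date


def infer_type(value):
    """Guess the type of a raw string value by trying parsers in order."""
    text = value.strip()
    if text == "":
        return "null"
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        pass
    try:
        date.fromisoformat(text)
        return "date"
    except ValueError:
        pass
    return "text"


def profile_column(values):
    """Count how many values of each inferred type a column contains."""
    return Counter(infer_type(v) for v in values)


def is_mixed(values):
    """Report whether a column holds incompatible types.

    Integers and floats together are not treated as mixed, because the
    whole column can be read as floats without losing information.
    """
    kinds = set(profile_column(values)) - {"null"}
    if kinds <= {"integer", "float"}:
        return False
    return len(kinds) > 1
```

A column flagged as mixed is a candidate for manual conversion: either convert the stray values, or store the whole column as text.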

Duplicate Records

  • Detect duplicate rows
  • Remove exact duplicates
  • Handle partial duplicates with custom logic
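
The difference between exact and partial duplicates can be sketched in Python as follows. The merge rule used for partial duplicates (match on normalized key columns, then fill gaps in the first row from later rows) is only one possible form of custom logic and is an assumption of this example, as are the function names.

```python
def drop_exact_duplicates(rows):
    """Keep the first occurrence of each fully identical row.

    Values must be hashable (strings, numbers, None).
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def merge_partial_duplicates(rows, key_columns,
                             normalize=lambda v: str(v).strip().lower()):
    """Treat rows as duplicates when their normalized key columns match.

    The first row seen for a key is kept, and its missing (None) fields
    are filled from later rows with the same key.
    """
    merged = {}
    for row in rows:
        key = tuple(normalize(row.get(c)) for c in key_columns)
        if key not in merged:
            merged[key] = dict(row)
        else:
            for column, value in row.items():
                if merged[key].get(column) is None:
                    merged[key][column] = value
    return list(merged.values())
```

For example, two customer records whose email addresses differ only in letter case or trailing spaces would be merged when `key_columns=["email"]`.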

Cleaning Operations

Basic Operations

  • Remove empty rows and columns
  • Trim whitespace from text fields
  • Standardize date formats
  • Convert units and currencies
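
As an illustration of what the first three basic operations do to the data, here is a small Python sketch. The helper names and the list of accepted date formats are assumptions chosen for the example, not GoPie settings.

```python
from datetime import datetime

# Input formats tried in order; the output is always ISO 8601 (YYYY-MM-DD).
KNOWN_DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")


def drop_empty_rows(rows):
    """Remove rows in which every value is None or an empty string."""
    return [row for row in rows if any(v not in (None, "") for v in row.values())]


def trim_text(row):
    """Strip leading and trailing whitespace from every text field."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def to_iso_date(text, formats=KNOWN_DATE_FORMATS):
    """Parse a date written in any known format and return it as YYYY-MM-DD.

    Returns None when no format matches, so the caller can flag the value.
    """
    for fmt in formats:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Note that a value such as 05/01/2024 is ambiguous between day-first and month-first conventions. The sketch assumes day-first, so confirm the convention used in your source data before standardizing.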

Advanced Operations

  • Regular expression transformations
  • Custom validation rules
  • Data normalization
  • Outlier detection and handling
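
Three of the advanced operations can be sketched briefly in Python. The specific techniques shown here (digit extraction with a regular expression, min-max normalization, and the interquartile range rule for outliers) are common choices picked for illustration; this guide does not state which methods GoPie uses, and the function names are hypothetical.

```python
import re
import statistics


def normalize_phone(text):
    """Regular expression transformation: keep digits only."""
    return re.sub(r"\D", "", text)


def min_max_normalize(values):
    """Rescale numeric values to the range 0..1."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the middle 50%."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]
```

Detected outliers are not necessarily errors. Review them before deciding whether to remove, cap, or keep them.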

Using the Data Cleaning Interface

  1. Select your dataset from the dashboard
  2. Navigate to the "Data Cleaning" tab
  3. Review the automated data quality report
  4. Choose the cleaning operations you need
  5. Preview the changes before applying them
  6. Save the cleaned dataset as a new version

Best Practices

  • Always keep a copy of your original data
  • Document all cleaning operations
  • Validate results after each operation
  • Use version control for dataset changes
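
If you also clean data in scripts outside GoPie, the first three practices can be combined in one small helper. The following Python sketch is an assumption about how such a helper might look, not a GoPie feature: it works on a copy of the data and records row counts for each step.

```python
import copy


def apply_logged(rows, operations):
    """Apply named cleaning steps to a copy of rows and log each step.

    operations is a list of (name, function) pairs, where each function
    takes a list of rows and returns a new list of rows.
    """
    current = copy.deepcopy(rows)  # the original data is never modified
    log = []
    for name, operation in operations:
        before = len(current)
        current = operation(current)
        log.append({"operation": name, "rows_before": before, "rows_after": len(current)})
    return current, log
```

The returned log doubles as documentation of what was done, and an unexpected change in row count after a step is a quick signal that the result needs closer validation.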
