User Guide/Querying Data

Multi-Dataset Queries

Query and join data across multiple datasets in GoPie

Combine and analyze data from multiple datasets to gain deeper insights with GoPie's multi-dataset query capabilities.

Overview

Multi-dataset queries allow you to:

  • Join data from different sources
  • Perform cross-dataset analysis
  • Create unified views of your data
  • Build comprehensive reports

Joining Datasets

Natural Language Joins

Simply describe how datasets relate:

  • "Show me sales data with customer information"
  • "Combine inventory levels with supplier details"
  • "Match user activities with their profile data"

Join Types

Inner Joins

Match records that exist in both datasets:

-- Example: Customers who made purchases
SELECT * FROM customers 
INNER JOIN orders ON customers.id = orders.customer_id

Left Joins

Include all records from the first dataset:

-- Example: All customers, with or without orders
SELECT * FROM customers 
LEFT JOIN orders ON customers.id = orders.customer_id

Full Outer Joins

Include all records from both datasets:

-- Example: All products and all categories
SELECT * FROM products 
FULL OUTER JOIN categories ON products.category_id = categories.id

Cross-Dataset Analysis

Aggregations Across Datasets

  • Sum values from multiple sources
  • Calculate averages across datasets
  • Find correlations between datasets
  • Compare metrics side-by-side

Common Patterns

Customer 360 View

Combine data from:

  • Transaction history
  • Support tickets
  • User profiles
  • Activity logs

Financial Consolidation

Merge data from:

  • Revenue streams
  • Expense categories
  • Budget allocations
  • Forecast models

Performance Optimization

Best Practices

  1. Index Join Columns: Ensure join fields are properly indexed
  2. Filter Early: Apply WHERE clauses before joins
  3. Limit Results: Use LIMIT during development
  4. Cache Results: Save frequently used joins

Query Planning

  • Preview execution plans
  • Identify performance bottlenecks
  • Optimize join order
  • Use materialized views for complex queries

Data Consistency

Handling Mismatches

  • Different date formats
  • Varying data types
  • Missing join keys
  • Duplicate records

Data Quality Checks

  • Validate join conditions
  • Check for orphaned records
  • Verify data completeness
  • Monitor join cardinality

Advanced Features

Dynamic Joins

  • Conditional joins based on data values
  • Time-based joins for temporal data
  • Fuzzy matching for inexact joins
  • Geographic joins for spatial data

Virtual Datasets

Create reusable views that:

  • Combine multiple datasets
  • Apply standard transformations
  • Enforce business rules
  • Simplify complex queries

Security Considerations

Access Control

  • Respect individual dataset permissions
  • Row-level security across joins
  • Column-level restrictions
  • Audit trail for multi-dataset access

Data Governance

  • Track data lineage
  • Document relationships
  • Monitor usage patterns
  • Ensure compliance

Examples

Sales Analysis

"Show me total revenue by product category, 
including product details and customer demographics"

Inventory Management

"Find products with low stock levels and 
their supplier contact information"

Customer Insights

"Identify high-value customers who haven't 
made a purchase in the last 30 days"

Next Steps