This folder contains practical examples that demonstrate the filtering and analysis capabilities of Dataspot. Each file focuses on specific use cases with concise, easy-to-understand code.
01_basic_query_filtering.py
Shows how to filter data before analysis using queries.
Use cases: E-commerce analysis, user segmentation by region/type.
python 01_basic_query_filtering.py
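As a rough illustration of what "filtering before analysis" means, the sketch below applies a query dict to a list of records in plain Python. It is not Dataspot's implementation: `matches_query` and the sample records are hypothetical. The assumed semantics (a single value must match exactly, a list accepts any of its values, and multiple fields must all match) are inferred from the query shapes listed in this README.

```python
def matches_query(record, query):
    """Return True if the record satisfies every field in the query (assumed AND semantics)."""
    for field, expected in query.items():
        value = record.get(field)
        if isinstance(expected, (list, tuple, set)):
            # A list of values is treated as "any of these"
            if value not in expected:
                return False
        elif value != expected:
            return False
    return True


# Hypothetical sample records
records = [
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
    {"country": "FR", "device": "mobile"},
]

us_only = [r for r in records if matches_query(r, {"country": "US"})]
mobile_us_or_fr = [
    r for r in records
    if matches_query(r, {"country": ["US", "FR"], "device": "mobile"})
]
print(len(us_only), len(mobile_us_or_fr))
```

Filtering records first keeps the later pattern analysis focused on the segment of interest and reduces the amount of data it has to scan.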
02_pattern_filtering_basic.py
Demonstrates filtering patterns after analysis using metrics.
Use cases: Support ticket analysis, finding significant patterns.
python 02_pattern_filtering_basic.py
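To make "filtering patterns after analysis" concrete, here is a minimal plain-Python sketch that counts values and then keeps only those meeting count and percentage thresholds. The helper `filter_patterns` and the ticket data are hypothetical. The parameter names mirror the `min_percentage`, `min_count`, and `limit` options named in this README, but the logic is an assumption about their meaning, not Dataspot's code.

```python
from collections import Counter


def filter_patterns(values, min_percentage=0.0, min_count=1, limit=None):
    """Count values, then keep those meeting the thresholds (most frequent first)."""
    counts = Counter(values)
    total = len(values)
    patterns = []
    for value, count in counts.most_common():
        percentage = 100.0 * count / total
        if count >= min_count and percentage >= min_percentage:
            patterns.append({"path": value, "count": count, "percentage": percentage})
    return patterns if limit is None else patterns[:limit]


# Hypothetical support ticket categories
tickets = ["billing"] * 5 + ["login"] * 3 + ["shipping"] * 2

significant = filter_patterns(tickets, min_percentage=25.0)
top_one = filter_patterns(tickets, limit=1)
for p in significant:
    print(f"{p['path']} - {p['count']} records ({p['percentage']:.1f}%)")
```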
03_text_pattern_filtering.py
Shows text-based filtering capabilities:
contains filters (include text)
exclude filters (exclude text)
Use cases: Web analytics, browser analysis, category filtering.
python 03_text_pattern_filtering.py
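The contains and exclude filters can be pictured as simple substring checks on each pattern's path. The sketch below is a plain-Python approximation under that assumption. The `text_filter` helper and the path strings are hypothetical, and this README does not say whether Dataspot's matching is case-sensitive.

```python
def text_filter(paths, contains=None, exclude=None):
    """Keep paths that include `contains` (if given) and do not include `exclude` (if given)."""
    kept = []
    for path in paths:
        if contains is not None and contains not in path:
            continue
        if exclude is not None and exclude in path:
            continue
        kept.append(path)
    return kept


# Hypothetical pattern paths
paths = [
    "browser=chrome",
    "browser=chrome > os=windows",
    "browser=firefox",
    "browser=safari > os=macos",
]

chrome_only = text_filter(paths, contains="chrome")
non_chrome = text_filter(paths, contains="browser", exclude="chrome")
print(chrome_only)
print(non_chrome)
```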
04_advanced_filtering.py
Complex scenarios combining multiple filter types.
Use cases: Sales analysis, enterprise segmentation, complex business queries.
python 04_advanced_filtering.py
05_data_quality_and_edge_cases.py
Handling problematic data and edge cases.
Use cases: Data cleaning, validation, real-world data issues.
python 05_data_quality_and_edge_cases.py
06_real_world_scenarios.py
Complete business use cases.
Use cases: End-to-end business applications.
python 06_real_world_scenarios.py
07_tree_visualization.py
Hierarchical data structures for dashboards.
Use cases: Interactive dashboards, hierarchical visualization, drill-down interfaces.
python 07_tree_visualization.py
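For readers unfamiliar with hierarchical pattern trees, the following sketch builds a nested count tree from records, one level per field. It only illustrates the general idea behind drill-down structures. The `build_tree` helper and the node keys (`name`, `value`, `children`) are assumptions, not the documented output of Dataspot's `tree()`.

```python
def build_tree(records, fields):
    """Nest records by each field in order, counting how many records reach each node."""
    root = {"name": "root", "value": len(records), "children": {}}
    for record in records:
        node = root
        for field in fields:
            key = f"{field}={record.get(field)}"
            child = node["children"].setdefault(
                key, {"name": key, "value": 0, "children": {}}
            )
            child["value"] += 1
            node = child
    return root


# Hypothetical sample records
records = [
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
    {"country": "FR", "device": "mobile"},
]

tree = build_tree(records, ["country", "device"])
us_node = tree["children"]["country=US"]
print(us_node["value"], sorted(us_node["children"]))
```

Each node's `value` is the number of records that share the path down to that node, which is the quantity a drill-down dashboard would typically display.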
08_auto_discovery.py ✨
Intelligent pattern discovery without manual field selection.
Use cases: Exploratory data analysis, fraud detection, business intelligence.
python 08_auto_discovery.py
09_temporal_comparison.py
Compare patterns between time periods.
Use cases: Fraud monitoring, performance tracking, A/B testing.
python 09_temporal_comparison.py
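The idea of comparing patterns between periods can be sketched with two counters and a per-value difference. This is a simplified stand-in written in plain Python, not Dataspot's `compare()`. The `compare_periods` helper and the payment data are hypothetical.

```python
from collections import Counter


def compare_periods(baseline, current):
    """Return the change in count for every value seen in either period."""
    before = Counter(baseline)
    after = Counter(current)
    # Counter returns 0 for missing keys, so new and vanished values are handled
    return {key: after[key] - before[key] for key in sorted(set(before) | set(after))}


# Hypothetical payment methods observed in two periods
last_month = ["card", "card", "wire"]
this_month = ["card", "wire", "wire", "crypto"]

changes = compare_periods(last_month, this_month)
for method, delta in changes.items():
    print(f"{method}: {delta:+d}")
```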
10_stats.py
Advanced statistical methods and calculations.
Use cases: A/B testing, fraud detection confidence, statistical validation.
python 10_stats.py
Install Dataspot:
pip install dataspot
Or for local development:
pip install -e .
# Navigate to examples folder
cd examples
# Run individual examples
python 01_basic_query_filtering.py
python 02_pattern_filtering_basic.py
# ... etc
# Or run all examples
for file in *.py; do
    echo "=== Running $file ==="
    python "$file"
    echo ""
done
All examples use the new structured API with Input/Options models:
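from dataspot import Dataspot
from dataspot.models.finder import FindInput, FindOptions

# Sample inputs (illustrative placeholders; substitute your own records)
data = [
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
    {"country": "FR", "device": "mobile"},
]
fields = ["country", "device"]
query = {"country": "US"}

# Basic usage
dataspot = Dataspot()
result = dataspot.find(
    FindInput(data=data, fields=fields, query=query),
    FindOptions(min_percentage=10.0, limit=5),
)

# Access results
patterns = result.patterns
for pattern in patterns:
    print(f"{pattern.path} - {pattern.count} records ({pattern.percentage:.1f}%)")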
find() - Find concentration patterns
analyze() - Comprehensive analysis with insights
tree() - Build hierarchical tree structures
discover() - Automatic pattern discovery
compare() - Compare datasets for changes

{"field": "value"}
{"field": ["value1", "value2"]}
{"field1": "value1", "field2": "value2"}

min_percentage / max_percentage - Percentage thresholds
min_count / max_count - Record count limits
min_depth / max_depth - Pattern complexity
contains - Text that must be present
exclude - Text that must be excluded
regex - Regular expression matching
limit - Maximum number of results

pip install dataspot
min_percentage)
Use query filters to reduce dataset size first
Use limit to restrict results
Use min_count or min_percentage thresholds

All examples are designed to be educational and easily modifiable for your specific use cases!