This folder contains practical examples that demonstrate the filtering and analysis capabilities of Dataspot. Each file focuses on specific use cases with concise, easy-to-understand code.
01_basic_query_filtering.py
Shows how to filter data before analysis using queries.
Use cases: E-commerce analysis, user segmentation by region/type.
python 01_basic_query_filtering.py
02_pattern_filtering_basic.py
Demonstrates filtering patterns after analysis using metrics.
Use cases: Support ticket analysis, finding significant patterns.
python 02_pattern_filtering_basic.py
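To make post-analysis filtering concrete, here is a minimal pure-Python sketch that does not call Dataspot. It assumes patterns are plain dicts with `path`, `count`, and `percentage` keys; `filter_patterns` is a hypothetical helper, and its parameter names only mirror the filter options listed later in this README.

```python
def filter_patterns(patterns, min_percentage=0.0, min_count=0, limit=None):
    """Keep patterns meeting both thresholds, largest share first (hypothetical helper)."""
    kept = [
        p for p in patterns
        if p["percentage"] >= min_percentage and p["count"] >= min_count
    ]
    kept.sort(key=lambda p: p["percentage"], reverse=True)
    return kept if limit is None else kept[:limit]


# Illustrative patterns, as if produced by an earlier analysis step.
patterns = [
    {"path": "priority=high", "count": 45, "percentage": 45.0},
    {"path": "priority=low", "count": 5, "percentage": 5.0},
    {"path": "priority=medium", "count": 50, "percentage": 50.0},
]

significant = filter_patterns(patterns, min_percentage=10.0, limit=2)
for p in significant:
    print(f"{p['path']} - {p['count']} records ({p['percentage']:.1f}%)")
```

The low-share pattern is dropped by the percentage threshold, and `limit` then caps how many of the remaining patterns are returned.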
03_text_pattern_filtering.py
Shows text-based filtering capabilities:
contains filters (include text)
exclude filters (exclude text)
Use cases: Web analytics, browser analysis, category filtering.
python 03_text_pattern_filtering.py
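The text filters can be pictured as simple substring and regular-expression checks on a pattern's path. The sketch below is plain Python, not Dataspot's implementation; `text_filter` is a hypothetical helper, and the `field=value > field=value` path format is assumed for illustration.

```python
import re


def text_filter(paths, contains=None, exclude=None, regex=None):
    """Return the paths that pass every text filter that was supplied."""
    result = []
    for path in paths:
        if contains is not None and contains not in path:
            continue  # required text is missing
        if exclude is not None and exclude in path:
            continue  # forbidden text is present
        if regex is not None and re.search(regex, path) is None:
            continue  # regular expression does not match
        result.append(path)
    return result


# Illustrative pattern paths (assumed format).
paths = [
    "browser=chrome > os=windows",
    "browser=chrome > os=macos",
    "browser=firefox > os=linux",
]

chrome_only = text_filter(paths, contains="chrome")
no_windows = text_filter(paths, exclude="windows")
unix_like = text_filter(paths, regex=r"os=(macos|linux)$")
```

Supplying several filters at once narrows the result further, since a path must pass all of them.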
04_advanced_filtering.py
Complex scenarios combining multiple filter types.
Use cases: Sales analysis, enterprise segmentation, complex business queries.
python 04_advanced_filtering.py
05_data_quality_and_edge_cases.py
Handling problematic data and edge cases.
Use cases: Data cleaning, validation, real-world data issues.
python 05_data_quality_and_edge_cases.py
06_real_world_scenarios.py
Complete business use cases.
Use cases: End-to-end business applications.
python 06_real_world_scenarios.py
07_tree_visualization.py
Hierarchical data structures for dashboards.
Use cases: Interactive dashboards, hierarchical visualization, drill-down interfaces.
python 07_tree_visualization.py
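As a rough picture of what a hierarchical structure for a drill-down dashboard can look like, the following pure-Python sketch groups records field by field into nested nodes. The node shape (`name`, `value`, `children`) and the `build_tree` helper are assumptions made for illustration and may differ from what Dataspot returns.

```python
def build_tree(records, fields, name="root"):
    """Group records by each field in turn, producing nested count nodes."""
    node = {"name": name, "value": len(records), "children": []}
    if not fields:
        return node
    field, rest = fields[0], fields[1:]
    groups = {}
    for record in records:
        groups.setdefault(record.get(field), []).append(record)
    for value, group in groups.items():
        node["children"].append(build_tree(group, rest, name=f"{field}={value}"))
    return node


# Illustrative records.
records = [
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
    {"country": "EU", "device": "mobile"},
]

tree = build_tree(records, ["country", "device"])
```

Each level of the tree corresponds to one field, so a dashboard can expand a node to drill down into its children.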
08_auto_discovery.py
✨ Intelligent pattern discovery without manual field selection.
Use cases: Exploratory data analysis, fraud detection, business intelligence.
python 08_auto_discovery.py
09_temporal_comparison.py
Compare patterns between time periods.
Use cases: Fraud monitoring, performance tracking, A/B testing.
python 09_temporal_comparison.py
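The core idea of comparing two time periods can be sketched with counters: count each value in a baseline window and a current window, then report the relative change. This is a simplified illustration in plain Python, not Dataspot's comparison logic; `compare_periods` and the sample records are hypothetical.

```python
from collections import Counter


def compare_periods(baseline, current, field):
    """Count values of `field` in both periods and compute the percentage change."""
    before = Counter(r[field] for r in baseline)
    after = Counter(r[field] for r in current)
    changes = {}
    for key in sorted(set(before) | set(after)):
        b, a = before[key], after[key]
        # A value absent from the baseline has no defined relative change.
        pct = None if b == 0 else (a - b) / b * 100.0
        changes[key] = {"baseline": b, "current": a, "change_pct": pct}
    return changes


# Illustrative records for two periods.
baseline = [{"method": "card"}] * 4 + [{"method": "wire"}]
current = [{"method": "card"}] * 4 + [{"method": "wire"}] * 3 + [{"method": "crypto"}]

changes = compare_periods(baseline, current, "method")
for key, info in changes.items():
    print(key, info)
```

Values that appear only in the current period are reported with `change_pct` set to `None`, which makes new patterns easy to flag separately.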
10_stats.py
Advanced statistical methods and calculations.
Use cases: A/B testing, fraud detection confidence, statistical validation.
python 10_stats.py
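For readers new to statistical validation in A/B testing, one common technique is the two-proportion z-test. The sketch below implements it with the standard library only; it is offered as background and is not necessarily the method used in `10_stats.py`. The function name and the sample numbers are hypothetical.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers: 100/1000 conversions in group A, 130/1000 in group B.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the two groups is unlikely to be due to chance alone, though the threshold to apply is a decision for the analyst.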
Install Dataspot:
pip install dataspot
Or for local development:
pip install -e .
# Navigate to examples folder
cd examples
# Run individual examples
python 01_basic_query_filtering.py
python 02_pattern_filtering_basic.py
# ... etc
# Or run all examples
for file in *.py; do
    echo "=== Running $file ==="
    python "$file"
    echo ""
done
All examples use the new structured API with Input/Options models:
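from dataspot import Dataspot
from dataspot.models.finder import FindInput, FindOptions

# Sample input (illustrative values; assumed shape is a list of dicts)
data = [
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
]
fields = ["country", "device"]
query = {"country": "US"}

# Basic usage
dataspot = Dataspot()
result = dataspot.find(
    FindInput(data=data, fields=fields, query=query),
    FindOptions(min_percentage=10.0, limit=5)
)

# Access results
patterns = result.patterns
for pattern in patterns:
    print(f"{pattern.path} - {pattern.count} records ({pattern.percentage:.1f}%)")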
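To give a feel for what a concentration pattern is, the following pure-Python sketch counts how many records share each prefix of a field hierarchy and reports the share of the total. It is a simplified illustration that does not use Dataspot; `concentration_patterns` and the `field=value > field=value` path format are assumptions.

```python
from collections import Counter


def concentration_patterns(records, fields):
    """Count records for every hierarchical prefix of `fields`."""
    counts = Counter()
    for record in records:
        parts = []
        for field in fields:
            parts.append(f"{field}={record.get(field)}")
            counts[" > ".join(parts)] += 1
    total = len(records)
    return [
        {"path": path, "count": n, "percentage": n / total * 100.0}
        for path, n in counts.most_common()
    ]


# Illustrative records.
records = [
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
    {"country": "EU", "device": "mobile"},
]

found = concentration_patterns(records, ["country", "device"])
for p in found:
    print(f"{p['path']} - {p['count']} records ({p['percentage']:.1f}%)")
```

Thresholds such as a minimum percentage can then be applied to this list to keep only the patterns where data is concentrated.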
find() - Find concentration patterns
analyze() - Comprehensive analysis with insights
tree() - Build hierarchical tree structures
discover() - Automatic pattern discovery
compare() - Compare datasets for changes
{"field": "value"}
{"field": ["value1", "value2"]}
{"field1": "value1", "field2": "value2"}
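One plausible reading of the three query forms above is: a single value must match exactly, a list matches any of its values, and multiple fields must all match. The sketch below encodes that reading in plain Python. These semantics are an assumption inferred from the forms shown, and `matches_query` and `apply_query` are hypothetical helpers, not Dataspot functions.

```python
def matches_query(record, query):
    """True if the record satisfies every field condition in the query."""
    for field, expected in query.items():
        allowed = expected if isinstance(expected, list) else [expected]
        if record.get(field) not in allowed:
            return False
    return True


def apply_query(records, query):
    """Return only the records that match the query."""
    return [r for r in records if matches_query(r, query)]


# Illustrative records.
records = [
    {"region": "EU", "type": "premium"},
    {"region": "US", "type": "free"},
    {"region": "APAC", "type": "premium"},
]

single = apply_query(records, {"region": "EU"})
any_of = apply_query(records, {"region": ["EU", "US"]})
combined = apply_query(records, {"region": ["EU", "APAC"], "type": "premium"})
```

Filtering this way before analysis shrinks the dataset, so later pattern counts and percentages are computed only over the matching records.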
min_percentage / max_percentage - Percentage thresholds
min_count / max_count - Record count limits
min_depth / max_depth - Pattern complexity
contains - Text that must be present
exclude - Text that must be excluded
regex - Regular expression matching
limit - Maximum number of results
pip install dataspot
min_percentage)
query filters to reduce dataset size first
limit to restrict results
min_count or min_percentage thresholds
All examples are designed to be educational and easy to modify for your specific use cases!