explore-dataset
Explore an Axiom dataset to understand its schema, fields, volume, and patterns. Use when exploring a new dataset, investigating data structure, or…
npx skills add https://github.com/axiomhq/cli --skill explore-dataset
Dataset Exploration
Systematically explore an Axiom dataset to understand its structure, content, and potential use cases.
Arguments
When invoked with a dataset name (e.g., /explore-dataset logs), the name is available as $ARGUMENTS.
Exploration Protocol
1. List Available Datasets
If no dataset is specified, list what's available:
axiom dataset list -f json
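As a minimal sketch of how the argument handling and this first step fit together, the helper below prints the command to run: the dataset listing when no name was given, otherwise the schema query for that dataset. The name first_step is hypothetical and not part of the Axiom CLI. The helper only prints the command, so it can be tried without credentials.

```shell
# first_step is a hypothetical helper, not part of the Axiom CLI.
# $1 = dataset name (may be empty, mirroring an empty $ARGUMENTS).
first_step() {
  if [ -z "$1" ]; then
    # No dataset given: list what is available.
    printf '%s\n' "axiom dataset list -f json"
  else
    # Dataset given: go straight to schema discovery.
    printf '%s\n' "axiom query \"['$1'] | getschema\" --start-time -1h"
  fi
}

first_step ""       # prints the listing command
first_step "logs"   # prints the schema query for the logs dataset
```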
2. Schema Discovery
Always start here. Discover actual field names and types:
axiom query "['<dataset>'] | getschema" --start-time -1h
Identify:
- Field names and types
- Dotted fields requiring bracket notation
- Timestamp fields
- Key dimensions (service, status, level)
OTel trace data: If schema contains trace_id, span_id, attributes.*, note that:
- Service fields are promoted: use ['service.name'] not ['resource.service.name']
- Custom attributes: ['attributes.custom']['field'] with tostring() for aggregations
- See axiom-apl skill's OTel reference for field mappings
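The OTel notes above can be sketched as a single aggregation. The helper name otel_query, the dataset name traces, and the attribute key field are placeholders for illustration. The query assumes the promoted ['service.name'] field and wraps the custom attribute in tostring() before grouping, as the notes suggest.

```shell
# otel_query is a hypothetical helper that only builds an APL query string.
# $1 = dataset name, $2 = key inside ['attributes.custom'] (both placeholders).
otel_query() {
  # extend converts the custom attribute to a string so it can be grouped on;
  # ['service.name'] is the promoted service field.
  printf '%s\n' "['$1'] | extend custom = tostring(['attributes.custom']['$2']) | summarize count() by ['service.name'], custom"
}

otel_query traces field
# To run it: axiom query "$(otel_query traces field)" --start-time -1h
```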
3. Sample Data
Examine actual values:
axiom query "['<dataset>'] | limit 10" --start-time -1h -f json
Look for:
- Data structure and relationships
- Field value formats
- Data quality issues
4. Volume Analysis
Understand data volume patterns:
axiom query "['<dataset>'] | summarize count() by bin(_time, 1h) | sort by _time asc" --start-time -24h
Analyze:
- Event volume over time
- Data freshness
- Collection gaps
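For the data freshness check, one possible companion query asks for the newest timestamp directly instead of reading it off the hourly bins. This is a sketch that assumes max() accepts the _time field in a summarize. The names freshness_query, last_event, and events are illustrative only.

```shell
# freshness_query is a hypothetical helper that only builds an APL query string.
# $1 = dataset name.
freshness_query() {
  # last_event = newest timestamp in the window, events = total rows in the window.
  printf '%s\n' "['$1'] | summarize last_event = max(_time), events = count()"
}

freshness_query logs
# To run it: axiom query "$(freshness_query logs)" --start-time -24h
```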
5. Categorical Field Analysis
For each key categorical field (status, level, service):
axiom query "['<dataset>'] | summarize count() by <field> | top 20 by count_" --start-time -1h
Identify:
- Value distributions
- Cardinality
- Key dimensions for filtering
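The top-20 query shows the value distribution but not the total number of distinct values. A hedged sketch for the cardinality point uses dcount(), which returns an estimated distinct count. The helper name cardinality_query is hypothetical. The field is always wrapped in bracket notation so that dotted names such as service.name work unchanged.

```shell
# cardinality_query is a hypothetical helper that only builds an APL query string.
# $1 = dataset name, $2 = field name (dotted names are fine thanks to ['...']).
cardinality_query() {
  printf '%s\n' "['$1'] | summarize dcount(['$2'])"
}

cardinality_query logs service.name
# To run it: axiom query "$(cardinality_query logs service.name)" --start-time -1h
```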
6. Numerical Field Statistics
For numeric fields (duration, bytes, count):
axiom query "['<dataset>'] | summarize count(), min(<field>), max(<field>), avg(<field>), percentiles(<field>, 50, 95, 99)" --start-time -1h
7. Error Pattern Detection
Search for error indicators:
axiom query "search in (['<dataset>']) 'error' or 'fail' or 'exception' | limit 20" --start-time -1h
Output Format
Provide a summary including:
## Dataset Summary: <name>
### Purpose
<What system generated this data, what it represents>
### Key Fields
| Field | Type | Description |
|-------|------|-------------|
| ... | ... | ... |
### Volume
- Events per hour: ~X
- Data freshness: last event at X
### Key Dimensions
- `status`: 200, 400, 500, ...
- `service.name`: api, web, worker, ...
### Recommended Queries
<Common queries for this dataset>
### Monitoring Opportunities
<What could be alerted on>
When NOT to Use
- Known datasets: If you already understand the schema, skip exploration and query directly
- Quick field check: Use getschema directly for single field lookups
- Production queries: Exploration uses expensive operations (search); extract patterns then optimize
- Repeated analysis: Once explored, document findings and reuse; don't re-explore
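As an illustration of "extract patterns then optimize", once exploration shows which field carries the error text, the broad search can be narrowed to a filter on that one field. The sketch below assumes a text field named message, which is hypothetical and should be replaced with whatever getschema reports. The helper name error_filter_query is also illustrative.

```shell
# error_filter_query is a hypothetical helper that only builds an APL query string.
# $1 = dataset name, $2 = text field found during exploration (e.g. message).
error_filter_query() {
  # Filter on a single known field instead of searching every field.
  printf '%s\n' "['$1'] | where ['$2'] contains 'error' or ['$2'] contains 'fail' or ['$2'] contains 'exception' | limit 20"
}

error_filter_query logs message
# To run it: axiom query "$(error_filter_query logs message)" --start-time -1h
```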
APL Reference
For query syntax, invoke the axiom-apl skill, which provides comprehensive documentation on operators, functions, and patterns.