AWS Cost & Operationsby zxkane

This skill provides AWS cost optimization, monitoring, and operational best practices with integrated MCP servers for billing analysis, cost estimation, observability, and security assessment.

AWS Cost & Operations

This skill provides comprehensive guidance for AWS cost optimization, monitoring, observability, and operational excellence with integrated MCP servers.

Integrated MCP Servers

This skill includes 8 MCP servers automatically configured with the plugin:

Cost Management Servers

1. AWS Billing and Cost Management MCP Server

Purpose: Real-time billing and cost management

  • View current AWS spending and trends
  • Analyze billing details across services
  • Track budget utilization
  • Monitor cost allocation tags
  • Review consolidated billing for organizations

2. AWS Pricing MCP Server

Purpose: Pre-deployment cost estimation and optimization

  • Estimate costs before deploying resources
  • Compare pricing across regions
  • Calculate Total Cost of Ownership (TCO)
  • Evaluate different service options for cost efficiency
  • Get current pricing information for AWS services

3. AWS Cost Explorer MCP Server

Purpose: Detailed cost analysis and reporting

  • Analyze historical spending patterns
  • Create custom cost reports
  • Identify cost anomalies and trends
  • Forecast future costs
  • Analyze cost by service, region, or tag
  • Generate cost optimization recommendations

Monitoring & Observability Servers

4. Amazon CloudWatch MCP Server

Purpose: Metrics, alarms, and logs analysis

  • Query CloudWatch metrics and logs
  • Create and manage CloudWatch alarms
  • Analyze application performance metrics
  • Troubleshoot operational issues
  • Set up custom dashboards
  • Monitor resource utilization

5. Amazon CloudWatch Application Signals MCP Server

Purpose: Application monitoring and performance insights

  • Monitor application health and performance
  • Analyze service-level objectives (SLOs)
  • Track application dependencies
  • Identify performance bottlenecks
  • Monitor service map and traces

6. AWS Managed Prometheus MCP Server

Purpose: Prometheus-compatible monitoring

  • Query Prometheus metrics
  • Monitor containerized applications
  • Analyze Kubernetes workload metrics
  • Create PromQL queries
  • Track custom application metrics

Audit & Security Servers

7. AWS CloudTrail MCP Server

Purpose: AWS API activity and audit analysis

  • Analyze AWS API calls and user activity
  • Track resource changes and modifications
  • Investigate security incidents
  • Audit compliance requirements
  • Identify unusual access patterns
  • Review who made what changes when

8. AWS Well-Architected Security Assessment Tool MCP Server

Purpose: Security assessment against Well-Architected Framework

  • Assess security posture against AWS best practices
  • Identify security gaps and vulnerabilities
  • Get security improvement recommendations
  • Review security pillar compliance
  • Generate security assessment reports

When to Use This Skill

Use this skill when:

  • Optimizing AWS costs and reducing spending
  • Estimating costs before deployment
  • Monitoring application and infrastructure performance
  • Setting up observability and alerting
  • Analyzing spending patterns and trends
  • Investigating operational issues
  • Auditing AWS activity and changes
  • Assessing security posture
  • Implementing operational excellence

Cost Optimization Best Practices

Pre-Deployment Cost Estimation

Always estimate costs before deploying:

  1. Use AWS Pricing MCP to estimate resource costs
  2. Compare pricing across different regions
  3. Evaluate alternative service options
  4. Calculate expected monthly costs
  5. Plan for scaling and growth

Example workflow:

"Estimate the monthly cost of running a Lambda function with
1 million invocations, 512MB memory, 3-second duration in us-east-1"

Cost Analysis and Optimization

Regular cost reviews:

  1. Use Cost Explorer MCP to analyze spending trends
  2. Identify cost anomalies and unexpected charges
  3. Review costs by service, region, and environment
  4. Compare actual vs. budgeted costs
  5. Generate cost optimization recommendations

Cost optimization strategies:

  • Right-size over-provisioned resources
  • Use appropriate storage classes (S3, EBS)
  • Implement auto-scaling for dynamic workloads
  • Leverage Savings Plans and Reserved Instances
  • Delete unused resources and snapshots
  • Use cost allocation tags effectively

Budget Monitoring

Track spending against budgets:

  1. Use Billing and Cost Management MCP to monitor budgets
  2. Set up budget alerts for threshold breaches
  3. Review budget utilization regularly
  4. Adjust budgets based on trends
  5. Implement cost controls and governance

Monitoring and Observability Best Practices

CloudWatch Metrics and Alarms

Implement comprehensive monitoring:

  1. Use CloudWatch MCP to query metrics and logs
  2. Set up alarms for critical metrics:
    • CPU and memory utilization
    • Error rates and latency
    • Queue depths and processing times
    • API gateway throttling
    • Lambda errors and timeouts
  3. Create CloudWatch dashboards for visualization
  4. Use log insights for troubleshooting

Example alarm scenarios:

  • Lambda error rate > 1%
  • EC2 CPU utilization > 80%
  • API Gateway 4xx/5xx error spike
  • DynamoDB throttled requests
  • ECS task failures

Application Performance Monitoring

Monitor application health:

  1. Use CloudWatch Application Signals MCP for APM
  2. Track service-level objectives (SLOs)
  3. Monitor application dependencies
  4. Identify performance bottlenecks
  5. Set up distributed tracing

Container and Kubernetes Monitoring

For containerized workloads:

  1. Use AWS Managed Prometheus MCP for metrics
  2. Monitor container resource utilization
  3. Track pod and node health
  4. Create PromQL queries for custom metrics
  5. Set up alerts for container anomalies

Audit and Security Best Practices

CloudTrail Activity Analysis

Audit AWS activity:

  1. Use CloudTrail MCP to analyze API activity
  2. Track who made changes to resources
  3. Investigate security incidents
  4. Monitor for suspicious activity patterns
  5. Audit compliance with policies

Common audit scenarios:

  • "Who deleted this S3 bucket?"
  • "Show all IAM role changes in the last 24 hours"
  • "List failed login attempts"
  • "Find all actions by a specific user"
  • "Track modifications to security groups"

Security Assessment

Regular security reviews:

  1. Use Well-Architected Security Assessment MCP
  2. Assess security posture against best practices
  3. Identify security gaps and vulnerabilities
  4. Implement recommended security improvements
  5. Document security compliance

Security assessment areas:

  • Identity and Access Management (IAM)
  • Detective controls and monitoring
  • Infrastructure protection
  • Data protection and encryption
  • Incident response preparedness

Using MCP Servers Effectively

Cost Analysis Workflow

  1. Pre-deployment: Use Pricing MCP to estimate costs
  2. Post-deployment: Use Billing MCP to track actual spending
  3. Analysis: Use Cost Explorer MCP for detailed cost analysis
  4. Optimization: Implement recommendations from Cost Explorer

Monitoring Workflow

  1. Setup: Configure CloudWatch metrics and alarms
  2. Monitor: Use CloudWatch MCP to track key metrics
  3. Analyze: Use Application Signals for APM insights
  4. Troubleshoot: Query CloudWatch Logs for issue resolution

Security Workflow

  1. Audit: Use CloudTrail MCP to review activity
  2. Assess: Use Well-Architected Security Assessment
  3. Remediate: Implement security recommendations
  4. Monitor: Track security events via CloudWatch

MCP Usage Best Practices

  1. Cost Awareness: Check pricing before deploying resources
  2. Proactive Monitoring: Set up alarms for critical metrics
  3. Regular Reviews: Analyze costs and performance weekly
  4. Audit Trails: Review CloudTrail logs for compliance
  5. Security First: Run security assessments regularly
  6. Optimize Continuously: Act on cost and performance recommendations

Operational Excellence Guidelines

Cost Optimization

  • Tag Everything: Use consistent cost allocation tags
  • Review Monthly: Analyze spending trends and anomalies
  • Right-size: Match resources to actual usage
  • Automate: Use auto-scaling and scheduling
  • Monitor Budgets: Set alerts for cost overruns

Monitoring and Alerting

  • Critical Metrics: Alert on business-critical metrics
  • Noise Reduction: Fine-tune thresholds to reduce false positives
  • Actionable Alerts: Ensure alerts have clear remediation steps
  • Dashboard Visibility: Create dashboards for key stakeholders
  • Log Retention: Balance cost and compliance needs

Security and Compliance

  • Least Privilege: Grant minimum required permissions
  • Audit Regularly: Review CloudTrail logs for anomalies
  • Encrypt Data: Use encryption at rest and in transit
  • Assess Continuously: Run security assessments frequently
  • Incident Response: Have procedures for security events

Additional Resources

For detailed operational patterns and best practices, refer to the comprehensive reference:

File: references/operations-patterns.md

This reference includes:

  • Cost optimization strategies
  • Monitoring and alerting patterns
  • Observability best practices
  • Security and compliance guidelines
  • Troubleshooting workflows

CloudWatch Alarms Reference

File: references/cloudwatch-alarms.md

Common alarm configurations for:

  • Lambda functions
  • EC2 instances
  • RDS databases
  • DynamoDB tables
  • API Gateway
  • ECS services
  • Application Load Balancers

Related Skills

Canvas Design
by Anthropic
Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art, design, or other static piece. Create original visual designs, never copying existing artists' work to avoid copyright violations. license: Complete terms in LICENSE.txt
Brand Guidelines
by Anthropic
Applies Anthropic's official brand colors and typography to any sort of artifact that may benefit from having Anthropic's look-and-feel. Use it when brand colors or style guidelines, visual formatting, or company design standards apply. license: Complete terms in LICENSE.txt
Domain Name Brainstormer
by composiohq
This skill helps you find the perfect domain name for your project by generating creative options and checking what's actually available to register.
NotebookLM Skill
by pleaseprompto
Let Claude Code chat directly with NotebookLM for source-grounded answers based exclusively on your uploaded documents
Playwright Skill
by lackeyjb
Enables Claude to write and execute any Playwright automation on-the-fly - from simple page tests to complex multi-step flows.
Internal Comms
by Anthropic
A set of resources to help me write all kinds of internal communications, using the formats that my company likes to use. Claude should use this skill whenever asked to write some sort of internal communications (status reports, leadership updates, 3P updates, company newsletters, FAQs, incident reports, project updates, etc.). license: Complete terms in LICENSE.txt
Knowledge Capture
by Notion
Turns discussions into durable knowledge in Notion. Captures insights and decisions from chat, formats them clearly, and files them to the right wiki or database with smart linking.
YouTube Transcript Downloader
by michalparkola
Download YouTube video transcripts when user provides a YouTube URL or asks to download/get/fetch a transcript from YouTube. Also use when user wants to transcribe or get captions/subtitles from a YouTube video.