Snapshot Lifecycle Management (SLM)
🏗️ Phase 1: AWS Infrastructure (S3 & IAM)
1. Create the S3 Bucket
- Log in to the AWS S3 Console.
- Click Create bucket.
- Bucket name: aws-s3-snap-backup (or your preferred name).
- Region: Choose the same region where your EC2/Elasticsearch nodes are running (to save on data transfer costs).
- Leave other settings as default and click Create bucket.
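If you prefer the command line to the console, the bucket creation above can be sketched with the AWS CLI. This assumes the AWS CLI is installed and configured with credentials that may create buckets; the helper name create_backup_bucket is made up for illustration.

```shell
# Hypothetical helper: create the snapshot bucket with the AWS CLI.
# Assumes the AWS CLI is installed and configured.
create_backup_bucket() {
  bucket="$1"   # e.g. aws-s3-snap-backup
  region="$2"   # the region your Elasticsearch nodes run in
  if [ "$region" = "us-east-1" ]; then
    # us-east-1 is the default location and rejects an explicit LocationConstraint.
    aws s3api create-bucket --bucket "$bucket" --region "$region"
  else
    aws s3api create-bucket --bucket "$bucket" --region "$region" \
      --create-bucket-configuration "LocationConstraint=$region"
  fi
}
```

For example, create_backup_bucket aws-s3-snap-backup eu-west-1 would create the bucket in eu-west-1 (the region here is only a placeholder).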
2. Create the IAM Policy
- Go to the IAM Console > Policies > Create policy.
- Switch to the JSON tab and paste the policy document, replacing aws-s3-snap-backup with your bucket name.
- Name it ElasticsearchS3BackupPolicy and save.
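A policy document for this step could look like the sketch below. It is not necessarily the exact policy this guide was written with: it assumes the permission set that the Elasticsearch S3 repository documentation recommends (bucket-level list permissions plus object-level read, write, and delete). The file name es-s3-backup-policy.json is an arbitrary choice.

```shell
# Bucket name from step 1; change it if you picked a different one.
BUCKET="aws-s3-snap-backup"

# Write the policy document to a file so it can be pasted into the JSON tab.
cat > es-s3-backup-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucketVersions"
      ],
      "Resource": "arn:aws:s3:::${BUCKET}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::${BUCKET}/*"
    }
  ]
}
EOF

# Print the result for copy and paste.
cat es-s3-backup-policy.json
```

The first statement targets the bucket itself and the second targets the objects inside it, which is why one Resource ends in /* and the other does not.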
3. Create & Attach the IAM Role
- In the IAM Console, go to Roles > Create role.
- Select AWS Service and choose EC2.
- Attach the ElasticsearchS3BackupPolicy you just created.
- Name the role Elasticsearch-S3-Role.
- Attach to EC2: Go to your EC2 instance list > Select your Elasticsearch nodes > Actions > Security > Modify IAM Role. Select Elasticsearch-S3-Role and save.
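To confirm the role really is attached, you can ask the EC2 instance metadata service from a shell on the node. This check is an addition to the steps above, sketched under the assumption that curl is available and that the instance uses IMDSv2; the helper name show_instance_role is hypothetical.

```shell
# Hypothetical helper: print the name of the IAM role attached to this EC2 instance.
# The metadata service is only reachable from the instance itself.
show_instance_role() {
  imds="http://169.254.169.254/latest"
  # IMDSv2 requires a short-lived session token before any metadata read.
  token=$(curl -sS -X PUT "$imds/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  curl -sS -H "X-aws-ec2-metadata-token: $token" \
    "$imds/meta-data/iam/security-credentials/"
}
```

Running show_instance_role on an Elasticsearch node should print Elasticsearch-S3-Role if the attachment worked.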
🛠️ Phase 2: Elasticsearch Node Configuration
[!IMPORTANT] You must perform these steps on EVERY node in your cluster that has the master or data role.
1. Install the S3 Plugin
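As a sketch of this step, the install could look like the helper below. The install path and the systemd service name are assumptions (they match a typical deb/rpm install), and install_s3_plugin is a made-up name. On Elasticsearch 8.0 and later the S3 repository type ships with the distribution, so the plugin install should only be needed on older versions.

```shell
# Hypothetical helper: install the S3 repository plugin on the current node.
# ES_HOME is an assumption; /usr/share/elasticsearch is the usual deb/rpm location.
install_s3_plugin() {
  es_home="${ES_HOME:-/usr/share/elasticsearch}"
  # --batch accepts the plugin's permission prompt without asking.
  sudo "$es_home/bin/elasticsearch-plugin" install --batch repository-s3 || return 1
  # A node only loads a new plugin after a restart (service name assumed).
  sudo systemctl restart elasticsearch
}
```

Restart one node at a time so the cluster stays available while the plugin is rolled out.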
Run the plugin installation from a terminal on each node.
2. Configure the Keystore (Optional but Recommended)
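Only for setups that need static keys instead of the IAM role: a minimal sketch of the keystore commands, assuming the default S3 client and the same install path as before. The helper name add_s3_keys is hypothetical.

```shell
# Hypothetical helper: store static AWS keys in the Elasticsearch keystore on this node.
# Only needed when the IAM role from Phase 1 is not used.
add_s3_keys() {
  es_home="${ES_HOME:-/usr/share/elasticsearch}"   # assumed install location
  # Each command prompts for the value, so the keys never appear in shell history.
  sudo "$es_home/bin/elasticsearch-keystore" add s3.client.default.access_key || return 1
  sudo "$es_home/bin/elasticsearch-keystore" add s3.client.default.secret_key
}
```

After adding the keys on every node, restart the nodes or call POST _nodes/reload_secure_settings so the new values are picked up.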
If you aren’t using the IAM role, this is where you add your AWS access key and secret key to the Elasticsearch keystore. Because this guide uses an IAM role, Elasticsearch picks up the credentials from the instance automatically, so you can skip this step unless your security team requires static keys.
🚀 Phase 3: Elasticsearch API Setup
1. Register the Repository
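As a sketch of this step, the registration request could look like the following. The repository name s3_backup and the cluster address are assumptions, and register_repo is a made-up helper; if security is enabled on your cluster you would also need to pass credentials to curl.

```shell
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed cluster address
REPO="s3_backup"                             # hypothetical repository name
BUCKET="aws-s3-snap-backup"                  # bucket from Phase 1

# Request body: an s3-type repository pointing at the bucket.
cat > register-repo.json <<EOF
{
  "type": "s3",
  "settings": {
    "bucket": "${BUCKET}"
  }
}
EOF

# Dev Tools equivalent: PUT _snapshot/s3_backup with the body above.
register_repo() {
  curl -sS -X PUT "$ES_URL/_snapshot/$REPO" \
    -H 'Content-Type: application/json' \
    --data-binary @register-repo.json
}
```

Elasticsearch verifies a repository when it is registered, so a permissions problem with the bucket or the IAM role usually surfaces as an error in the response to this call.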
Now tell Elasticsearch where to find the bucket by registering it as a snapshot repository from Kibana Dev Tools.
2. Create the SLM Policy (500-Day Retention)
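A minimal sketch of such a policy follows. The policy id nightly-snapshots, the snapshot name pattern, the repository name s3_backup, and the index pattern demo-app-logs-* are assumptions chosen to match the rest of this guide. Elasticsearch cron expressions start with a seconds field and are evaluated in UTC, so 0 30 23 * * ? means 23:30 UTC every day.

```shell
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed cluster address

# Policy body: nightly snapshot of the log indices, expired after 500 days.
cat > slm-policy.json <<'EOF'
{
  "schedule": "0 30 23 * * ?",
  "name": "<nightly-snap-{now/d}>",
  "repository": "s3_backup",
  "config": {
    "indices": ["demo-app-logs-*"]
  },
  "retention": {
    "expire_after": "500d"
  }
}
EOF

# Dev Tools equivalent: PUT _slm/policy/nightly-snapshots with the body above.
create_slm_policy() {
  curl -sS -X PUT "$ES_URL/_slm/policy/nightly-snapshots" \
    -H 'Content-Type: application/json' \
    --data-binary @slm-policy.json
}
```

To test the policy without waiting for the schedule, POST _slm/policy/nightly-snapshots/_execute triggers a snapshot immediately.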
The SLM policy automates a daily backup at 11:30 PM and removes snapshots once they are older than 500 days.
⏪ Phase 4: How to Restore (The “Emergency” Guide)
If ILM has deleted an index and you need its logs back, follow these steps:
1. Find the Snapshot Name
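One way to list the available snapshots is the cat snapshots API, sketched here with curl. The repository name s3_backup and the cluster address are assumptions, and list_snapshots is a made-up helper.

```shell
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed cluster address
REPO="s3_backup"                             # hypothetical repository name

# Dev Tools equivalent: GET _cat/snapshots/s3_backup?v&s=start_epoch
# Lists snapshots oldest first, with their ids, status, and start times.
list_snapshots() {
  curl -sS "$ES_URL/_cat/snapshots/$REPO?v&s=start_epoch"
}
```

Pick a snapshot taken after the index was created and before ILM deleted it. GET _snapshot/s3_backup/<snapshot-id> shows which index names a given snapshot contains.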
2. Restore a Specific Index
To restore demo-app-logs-2026.02.10 from the snapshot, restore it under a new name. A restored- prefix prevents conflicts with existing indices.
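The restore described above can be sketched as follows. The repository name s3_backup and the cluster address are assumptions, restore_index is a made-up helper, and the snapshot name is whatever you found in step 1.

```shell
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed cluster address
REPO="s3_backup"                             # hypothetical repository name
INDEX="demo-app-logs-2026.02.10"             # index to bring back

# The backslash keeps a literal $1 in the file; in the rename it stands for
# the original index name captured by rename_pattern.
cat > restore-body.json <<EOF
{
  "indices": "${INDEX}",
  "rename_pattern": "(.+)",
  "rename_replacement": "restored-\$1"
}
EOF

# Usage: restore_index <snapshot-name>
# Dev Tools equivalent: POST _snapshot/s3_backup/<snapshot-name>/_restore
restore_index() {
  curl -sS -X POST "$ES_URL/_snapshot/$REPO/$1/_restore" \
    -H 'Content-Type: application/json' \
    --data-binary @restore-body.json
}
```

With this body the index comes back as restored-demo-app-logs-2026.02.10, so it cannot collide with an index that still exists under the original name.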
✅ Summary of the Whole System
- Fluent Bit: Sends logs to a new index each day (e.g., demo-app-logs-2026.03.05).
- ILM: Deletes indices from the disk after 15 days.
- SLM: Copies indices to S3 every night and keeps them for 500 days.
- Storage: You save money by keeping only 15 days on fast disks while keeping the rest on cheap S3 storage.
