Step-by-Step Guides
Practical tutorials for common migration scenarios and workflows.
Local Filesystem to S3
Upload a local directory to an S3-compatible bucket. Perfect for initial data seeding or backup workflows.
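If you don't yet have anything to upload, a quick way to seed the `./testdata/` directory used in the example below (a hypothetical helper script, not part of Godwit):

```python
from pathlib import Path

def seed(root: str = "./testdata", count: int = 3) -> list[Path]:
    """Create a few small sample files so the sync has data to upload."""
    base = Path(root)
    base.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(count):
        p = base / f"sample-{i}.txt"
        p.write_text(f"sample payload {i}\n")
        written.append(p)
    return written
```

Adjust paths and file count to match your own layout.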
CLI command:
# Upload local data to S3
godwit sync \
--source ./testdata/ \
--destination s3://my-bucket/backup \
--destination-access-key access_key \
--destination-secret-key secret_key \
--destination-endpoint localhost:9001 \
--destination-secure=false \
--state-path ./tmp/state.db \
--logs-dir ./tmp/logs \
--ui
Config file equivalent:
# local-to-s3.yml
source:
  url: ./testdata/
destination:
  url: s3://my-bucket/backup
  access_key: access_key
  secret_key: secret_key
  endpoint: localhost:9001
  secure: false
run:
  state_path: ./tmp/state.db
output:
  logs_dir: ./tmp/logs
  ui: true
S3 to Local Filesystem
Download objects from an S3 bucket to your local filesystem. Useful for creating local backups or restoring data.
CLI command:
# Pull bucket contents to local disk
godwit sync \
--source s3://my-bucket/data \
--source-access-key access_key \
--source-secret-key secret_key \
--source-endpoint localhost:9001 \
--source-secure=false \
--destination ./downloads/ \
--state-path ./tmp/state.db \
--logs-dir ./tmp/logs \
--ui
Config file equivalent:
# s3-to-local.yml
source:
  url: s3://my-bucket/data
  access_key: access_key
  secret_key: secret_key
  endpoint: localhost:9001
  secure: false
destination:
  url: ./downloads/
run:
  state_path: ./tmp/state.db
output:
  logs_dir: ./tmp/logs
  ui: true
S3 to S3 Migration
Copy data between S3-compatible endpoints. Transfers are performed fully on the fly, without local storage: data streams directly from source to destination.
CLI command:
# Copy between S3-compatible endpoints
godwit sync \
--source s3://source-bucket/data \
--source-endpoint source.storage.example.com \
--source-region us-east-1 \
--source-access-key SOURCE_ACCESS_KEY \
--source-secret-key SOURCE_SECRET_KEY \
--source-secure=true \
--destination s3://dest-bucket/backup \
--destination-endpoint localhost:9000 \
--destination-access-key access_key \
--destination-secret-key secret_key \
--destination-secure=false \
--state-path ./tmp/state.db \
--logs-dir ./tmp/logs
Config file equivalent:
# s3-to-s3.yml
source:
  url: s3://source-bucket/data
  endpoint: source.storage.example.com
  region: us-east-1
  access_key: SOURCE_ACCESS_KEY
  secret_key: SOURCE_SECRET_KEY
  secure: true
destination:
  url: s3://dest-bucket/backup
  endpoint: localhost:9000
  access_key: access_key
  secret_key: secret_key
  secure: false
run:
  state_path: ./tmp/state.db
output:
  logs_dir: ./tmp/logs
Cross-Cluster S3 Transfer
Copy between two S3-compatible endpoints with filtering and worker limits:
CLI command:
# Copy between S3-compatible endpoints
# Skip checksum files and limit to 2 workers
godwit sync \
--source s3://source-bucket \
--source-endpoint localhost:9001 \
--source-access-key access_key \
--source-secret-key secret_key \
--source-secure=false \
--destination s3://dest-bucket/backup \
--destination-endpoint localhost:9101 \
--destination-access-key access_key \
--destination-secret-key secret_key \
--destination-secure=false \
--skip .md5 \
--parallel 2 \
--ui
Config file equivalent:
# cross-cluster.yml
source:
  url: s3://source-bucket
  endpoint: localhost:9001
  access_key: access_key
  secret_key: secret_key
  secure: false
destination:
  url: s3://dest-bucket/backup
  endpoint: localhost:9101
  access_key: access_key
  secret_key: secret_key
  secure: false
policy:
  skip:
    - .md5
options:
  parallel: 2
output:
  ui: true
Restricted IAM Permissions
When the source IAM identity does not have s3:GetObjectTagging permission, use --skip-tags to skip reading object tags instead of failing:
CLI command:
# S3 source with restricted IAM
godwit sync \
--source s3://my-bucket/data \
--source-endpoint source.storage.example.com \
--source-region us-east-1 \
--source-access-key ACCESS_KEY \
--source-secret-key SECRET_KEY \
--source-secure=true \
--skip-tags \
--destination ./downloads \
--state-path ./state.db
Config file equivalent:
# restricted-iam.yml
source:
  url: s3://my-bucket/data
  endpoint: source.storage.example.com
  region: us-east-1
  access_key: ACCESS_KEY
  secret_key: SECRET_KEY
  secure: true
destination:
  url: ./downloads
policy:
  skip_tags: true
run:
  state_path: ./state.db
Plan List & Inspect
Every sync run is tracked in the state database. Use plan commands to list past runs and inspect their detailed progress. All plan commands support --config (-f) so you can reuse the same YAML config file from your sync.
List All Runs
View every recorded run with status, object count, transferred bytes, duration, and failures. Point --state-path at your state database, or use -f to load it from a config file.
CLI command:
# List all sync runs
godwit plan list \
--state-path ./tmp/state.db
# Or reuse the same config from your sync
godwit plan list -f migration.yml
Config file equivalent:
# migration.yml (shared config)
run:
  state_path: ./tmp/state.db
  # run_id: <run-id> ← used by inspect/objects
# source/destination config...
Inspect a Run
Get a full summary for a specific run: total, pending, finished, and failed object counts, plus data transferred vs. remaining. Use -f to load run_id and state_path from a config file.
CLI command:
# Inspect a specific run
godwit plan inspect \
--run-id <run-id> \
--state-path ./tmp/state.db
# Or with config file
godwit plan inspect -f migration.yml
Config file equivalent:
# migration.yml
run:
  run_id: <run-id>
  state_path: ./tmp/state.db
List Plan Objects
List objects in a run filtered by status. Combine statuses with + (e.g. pending+running) and optionally filter by --storage-class. Use -f to load run_id and state_path from a config file.
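The `+` syntax combines statuses into a single filter. Conceptually (an illustrative sketch; the field and status names here are assumptions, not Godwit internals):

```python
def filter_by_status(objects: list[dict], statuses: str) -> list[dict]:
    """Keep objects whose status appears in a '+'-joined status string,
    e.g. 'pending+running'. The special value 'all' matches everything."""
    if statuses == "all":
        return list(objects)
    wanted = set(statuses.split("+"))
    return [o for o in objects if o["status"] in wanted]
```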
CLI command:
# List all objects in a run
godwit plan list objects all \
--run-id <run-id> \
--state-path ./tmp/state.db
# Or with config file
godwit plan list objects all -f migration.yml
# List pending + running objects
godwit plan list objects pending+running \
-f migration.yml
# Filter by storage class
godwit plan list objects all \
--storage-class GLACIER \
-f migration.yml
Config file equivalent:
# migration.yml
run:
  run_id: <run-id>
  state_path: ./tmp/state.db
Plan Verify
After a sync completes, verify that all transferred objects match their expected checksums. Godwit compares each object's MD5 against its .md5 sidecar file at the destination.
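Conceptually, the comparison looks like this (a minimal sketch assuming the sidecar stores the hex digest as plain text, optionally followed by the filename; Godwit's actual implementation may differ):

```python
import hashlib
from pathlib import Path

def md5_matches(obj_path: str, sidecar_path: str) -> bool:
    """Compare a file's MD5 against the digest stored in its .md5 sidecar."""
    digest = hashlib.md5(Path(obj_path).read_bytes()).hexdigest()
    # Sidecar files commonly contain "<hex>  <filename>"; take the first token.
    expected = Path(sidecar_path).read_text().split()[0].lower()
    return digest == expected
```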
Verify a Completed Run
Run checksum verification for all completed objects in a sync run. Point --destination at the same endpoint used during sync.
CLI command:
# Verify a completed sync run
godwit plan verify \
--run-id <run-id> \
--destination s3://dest-bucket/backup \
--destination-endpoint localhost:9000 \
--destination-access-key access_key \
--destination-secret-key secret_key \
--destination-secure=false \
--state-path ./tmp/state.db \
--ui
Config file equivalent:
# verify.yml
run:
  run_id: <run-id>
  state_path: ./tmp/state.db
destination:
  url: s3://dest-bucket/backup
  endpoint: localhost:9000
  access_key: access_key
  secret_key: secret_key
  secure: false
output:
  ui: true
Resume Verification
If verification is interrupted, use --resume to skip already-verified objects and continue from where you left off.
CLI command:
# Resume an interrupted verification
godwit plan verify \
--run-id <run-id> \
--destination s3://dest-bucket/backup \
--destination-endpoint localhost:9000 \
--destination-access-key access_key \
--destination-secret-key secret_key \
--destination-secure=false \
--state-path ./tmp/state.db \
--resume --ui
Config file equivalent:
# verify-resume.yml
run:
  run_id: <run-id>
  state_path: ./tmp/state.db
destination:
  url: s3://dest-bucket/backup
  endpoint: localhost:9000
  access_key: access_key
  secret_key: secret_key
  secure: false
output:
  ui: true
# Pass --resume on the CLI:
# godwit plan verify -f verify-resume.yml --resume
Rate Limiting and Throttling
Control transfer speed to avoid overwhelming source or destination systems. Essential for production environments.
Request Rate Limit
Cap the number of API requests per second with --rps. Useful when the source or destination has strict request quotas.
CLI command:
# Limit to 50 requests per second
godwit sync \
--source s3://source-bucket \
--destination s3://dest-bucket \
--rps 50 \
...
Config file equivalent:
# rps-limit.yml
rate_limit:
  rps: 50
# ... source/destination config
Bandwidth Limit
Cap the read throughput in bytes per second with --read-bps. Prevents saturating network links during business hours.
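The flag takes a raw byte count; the examples in this guide treat "MB" as binary megabytes (MiB). A quick way to compute the value (plain Python, shown only for convenience):

```python
def mb_per_sec(mb: int) -> int:
    """Convert megabytes-per-second (binary MiB) to the raw
    bytes-per-second value expected by --read-bps / read_bps."""
    return mb * 1024 * 1024

print(mb_per_sec(100))  # -> 104857600, as used in the example below
```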
CLI command:
# Limit read bandwidth to 100MB/s
godwit sync \
--source s3://source-bucket \
--destination s3://dest-bucket \
--read-bps 104857600 \
...
Config file equivalent:
# bandwidth-limit.yml
rate_limit:
  read_bps: 104857600 # 100MB/s
# ... source/destination config
Concurrent Upload Limit
Cap the number of concurrent in-flight uploads with --max-inflight. Helps avoid overwhelming the destination when objects are large.
CLI command:
# Limit concurrent uploads
godwit sync \
--source s3://source-bucket \
--destination s3://dest-bucket \
--max-inflight 10 \
...
Config file equivalent:
# inflight-limit.yml
rate_limit:
  max_inflight: 10
# ... source/destination config
Combined Limits
Combine multiple limits for fine-grained control. All limits are enforced simultaneously — the most restrictive one wins at any moment.
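For example, with small objects the request cap can bind before the bandwidth cap. A rough back-of-the-envelope check (illustrative only; it assumes one request per object and ignores multipart and metadata requests):

```python
def binding_limit(avg_object_bytes: int, rps: int, read_bps: int) -> str:
    """Estimate which limit caps throughput for a given average object size."""
    rps_throughput = rps * avg_object_bytes  # bytes/s if only --rps applied
    return "rps" if rps_throughput < read_bps else "read_bps"
```

With 4 MiB objects, rps 25 implies ~100 MiB/s of reads, so a 50 MiB/s read_bps is the binding limit; with 1 KiB objects the request cap binds instead.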
CLI command:
# Combine limits for fine-grained control
godwit sync \
--source s3://production-bucket \
--destination s3://backup-bucket \
--rps 25 \
--read-bps 52428800 \
--parallel 2 \
--max-inflight 5 \
...
Config file equivalent:
# production-limits.yml
rate_limit:
  rps: 25
  read_bps: 52428800 # 50MB/s
  max_inflight: 5
options:
  parallel: 2
# ... source/destination config
Version History Migration
Transfer all versions of every object, not just the latest. When versioned buckets contain objects in cold storage classes (GLACIER, DEEP_ARCHIVE, GLACIER_IR), Godwit automatically skips those versions and reports on completeness.
Transfer All Versions
Use --version-mode all to enumerate and transfer every version of every object from the source bucket. Each version is individually compared and transferred to the destination.
CLI command:
# Transfer all object versions from source to destination
godwit sync \
--source s3://source-bucket/data \
--destination s3://dest-bucket/data \
--source-endpoint source.storage.example.com \
--source-access-key SOURCE_KEY \
--source-secret-key SOURCE_SECRET \
--destination-endpoint dest.storage.example.com \
--destination-access-key DEST_KEY \
--destination-secret-key DEST_SECRET \
--version-mode all \
--state-path ./tmp/state.db \
--logs-dir ./tmp/logs \
--brief
Config file equivalent:
# version-history.yml
source:
  url: s3://source-bucket/data
  endpoint: source.storage.example.com
  access_key: SOURCE_KEY
  secret_key: SOURCE_SECRET
destination:
  url: s3://dest-bucket/data
  endpoint: dest.storage.example.com
  access_key: DEST_KEY
  secret_key: DEST_SECRET
versioning:
  mode: all
run:
  state_path: ./tmp/state.db
output:
  logs_dir: ./tmp/logs
  brief: true
Transfer Versions Since a Date
Use --version-mode "since:<RFC3339>" to transfer only versions created after a specific timestamp. This is useful for incremental version backups where you only need recent changes.
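A small helper can build the argument for a rolling window, e.g. "everything from the last N days" (an illustrative sketch; only the RFC3339-with-Z format shown in the example is assumed):

```python
from datetime import datetime, timedelta, timezone

def since_arg(days: int) -> str:
    """Build a --version-mode 'since:<RFC3339>' argument for N days ago."""
    ts = datetime.now(timezone.utc) - timedelta(days=days)
    # RFC3339 UTC timestamp with a 'Z' suffix, e.g. 2025-01-01T00:00:00Z
    return "since:" + ts.strftime("%Y-%m-%dT%H:%M:%SZ")
```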
CLI command:
# Transfer only versions created after a specific date
godwit sync \
--source s3://source-bucket/data \
--destination s3://dest-bucket/data \
--version-mode "since:2025-01-01T00:00:00Z" \
--state-path ./tmp/state.db \
...
Config file equivalent:
# version-since.yml
versioning:
  mode: "since:2025-01-01T00:00:00Z"
# ... source/destination config
Glacier and Cold Storage Handling
When --version-mode all encounters objects in GLACIER, DEEP_ARCHIVE, or GLACIER_IR storage classes, those versions are automatically skipped (they require a restore before they can be read). Godwit warns about glacier objects during planning and reports partial version history after completion.
# Sync versioned bucket with mixed storage classes
# Glacier/Deep Archive versions are automatically skipped
godwit sync \
--source s3://source-bucket/data \
--destination s3://dest-bucket/data \
--version-mode all \
--state-path ./tmp/state.db \
--brief \
...
# Example output:
# Planning...
# ⚠ Warning: 9 GLACIER objects detected.
# Restore required before migration.
# Uploading...
# ⚠ 3 keys have partial version history
# (some versions skipped due to Glacier storage class)
# Version History: 4 complete, 3 partial, 1 fully skipped
Version History Outcomes
After a versioned sync, each key is classified into one of three outcomes based on how its versions were handled:
Complete: all versions of this key were transferred successfully. No versions were in cold storage.
Partial: some versions were transferred, but others were skipped due to Glacier/Deep Archive storage class. The key's history is incomplete at the destination.
Fully skipped: all versions of this key are in cold storage. Nothing was transferred for this key.
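The classification reduces to a simple rule over each key's versions (illustrative sketch; the field names are not Godwit's internals):

```python
def classify_key(versions: list[dict]) -> str:
    """Classify a key's version history as 'complete', 'partial', or
    'skipped'. Each version dict carries a boolean 'glacier' flag
    marking cold-storage versions that could not be transferred."""
    skipped = sum(1 for v in versions if v["glacier"])
    if skipped == 0:
        return "complete"
    if skipped == len(versions):
        return "skipped"
    return "partial"
```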
Inspecting Version History
After a versioned sync, use plan inspect to see version history completeness, and plan list objects with --partial-history to identify keys with incomplete history.
# Inspect version history completeness for a run
godwit plan inspect --run-id <run-id> --state-path ./tmp/state.db
# Example output:
# Version History:
# Complete History: 4 keys
# Partial History: 3 keys ⚠
# Fully Skipped: 1 keys
#
# Storage classes detected:
# STANDARD: 60.9% 14 objects 144 B
# GLACIER: 39.1% 9 objects 143 B
# List keys with partial version history
godwit plan list objects all --partial-history \
--run-id <run-id> --state-path ./tmp/state.db
# List all glacier-skipped objects
godwit plan list objects glacier \
--run-id <run-id> --state-path ./tmp/state.db
Storage Class Behavior
How each S3 storage class is handled during versioned transfers:
| Storage Class | Behavior |
|---|---|
| STANDARD | Transferred normally. All versions are eligible for copy. |
| STANDARD_IA | Transferred normally. Same as STANDARD but with different S3 pricing. |
| GLACIER | Skipped automatically. Objects must be restored to STANDARD before transfer. Versions are marked with glacier status. |
| DEEP_ARCHIVE | Skipped automatically. Longest restore time (up to 48 hours). Same skip behavior as GLACIER. |
| GLACIER_IR | Skipped automatically. Despite faster retrieval than GLACIER, still requires a restore operation before transfer. |
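The table above reduces to a simple predicate (a sketch using the storage class names from the table):

```python
# Cold storage classes that require a restore before they can be read
COLD_CLASSES = {"GLACIER", "DEEP_ARCHIVE", "GLACIER_IR"}

def eligible_for_transfer(storage_class: str) -> bool:
    """True if a version in this storage class can be copied directly;
    cold classes are skipped automatically until restored."""
    return storage_class not in COLD_CLASSES
```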
Object Lock Preservation
Replicate Object Lock retention modes and legal hold settings from source to destination. When enabled, Godwit reads each object's lock configuration and applies the same retention and legal hold at the destination.
Enabling Object Lock
Add --object-lock to your sync command. The destination bucket must have Object Lock enabled. Godwit Sync reads each object's retention policy and legal hold from the source and applies them when writing to the destination.
CLI command:
# Sync with Object Lock preservation
godwit sync \
--source s3://source-bucket/data \
--destination s3://dest-bucket/data \
--source-endpoint source.storage.example.com \
--source-access-key SOURCE_KEY \
--source-secret-key SOURCE_SECRET \
--destination-endpoint dest.storage.example.com \
--destination-access-key DEST_KEY \
--destination-secret-key DEST_SECRET \
--version-mode all \
--object-lock \
--state-path ./tmp/state.db \
--brief
Config file equivalent:
# object-lock.yml
source:
  url: s3://source-bucket/data
  endpoint: source.storage.example.com
  access_key: SOURCE_KEY
  secret_key: SOURCE_SECRET
destination:
  url: s3://dest-bucket/data
  endpoint: dest.storage.example.com
  access_key: DEST_KEY
  secret_key: DEST_SECRET
versioning:
  mode: all
object_lock:
  enabled: true
run:
  state_path: ./tmp/state.db
output:
  brief: true
Retention Modes
Godwit Sync preserves each version's lock type. After a sync, every version is classified by its lock configuration:
Governance: retention with bypass. Privileged users can override the lock before the retain-until date.
Compliance: strict retention. No user, including root, can delete or shorten the retention period.
Legal hold: indefinite hold independent of retention. Must be explicitly removed before the object can be deleted.
None: no Object Lock configuration on this version. Transferred without any lock settings.
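These categories can be sketched as a mapping over each version's lock settings (illustrative only; a version can in principle carry both retention and a legal hold, and this sketch lets legal hold take labeling precedence):

```python
from typing import Optional

def lock_label(retention_mode: Optional[str], legal_hold: bool) -> str:
    """Map a version's Object Lock settings to the categories above."""
    if legal_hold:
        return "legal-hold"
    if retention_mode == "GOVERNANCE":
        return "governance"
    if retention_mode == "COMPLIANCE":
        return "compliance"
    return "none"
```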
Inspecting Object Lock Status
After a sync with --object-lock, use plan inspect to see a breakdown of lock types across all transferred versions.
# Inspect Object Lock statistics for a run
godwit plan inspect --run-id <run-id> --state-path ./tmp/state.db
# Example output:
# Object Lock:
# Governance: 12 versions
# Compliance: 4 versions
# Legal Hold: 2 versions
# None: 38 versions
Resume and Recovery
Safely interrupt and resume transfers. The state database tracks progress so you never lose work.
CLI command:
# Step 1: Plan the transfer
godwit sync \
--source s3://large-bucket \
--destination s3://backup-bucket \
--state-path ./migration.db \
--plan-only \
...
# Step 2: Start execution (can be interrupted with Ctrl+C)
godwit sync \
--source s3://large-bucket \
--destination s3://backup-bucket \
--state-path ./migration.db \
--resume \
...
# Step 3: After interruption, resume from where you left off
godwit sync \
--source s3://large-bucket \
--destination s3://backup-bucket \
--state-path ./migration.db \
--resume \
...
Config file equivalent:
# migration.yml
source:
  url: s3://large-bucket
  # access_key, secret_key, endpoint...
destination:
  url: s3://backup-bucket
  # access_key, secret_key, endpoint...
run:
  state_path: ./migration.db
# Step 1: Plan only
# godwit sync -f migration.yml --plan-only
# Step 2: Start execution
# godwit sync -f migration.yml --resume
# Step 3: Resume after interruption
# godwit sync -f migration.yml --resume
Prometheus Monitoring
Integrate Godwit Sync with your Prometheus monitoring stack.
CLI command:
# Run sync with metrics enabled
godwit sync \
--source ./data \
--destination s3://my-bucket/backup \
--destination-endpoint localhost:9000 \
--destination-access-key access_key \
--destination-secret-key secret_key \
--destination-secure=false \
--status-addr :8080 \
--drain-timeout 30
Config file equivalent:
# prometheus-sync.yml
source:
  url: ./data
destination:
  url: s3://my-bucket/backup
  endpoint: localhost:9000
  access_key: access_key
  secret_key: secret_key
  secure: false
status:
  addr: ":8080"
  drain_timeout: 30
Add a scrape job to your Prometheus configuration:
# prometheus.yml
scrape_configs:
  - job_name: godwit
    static_configs:
      - targets: ["localhost:8080"]
Available Endpoints
/metrics
Prometheus-format metrics including counters, histograms, and ETA gauge.
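If you want to consume the endpoint outside Prometheus, a minimal parser for the text exposition format is easy to sketch (the metric names below are hypothetical, for illustration only; it ignores HELP/TYPE comments and labeled samples):

```python
def parse_metrics(text: str) -> dict:
    """Parse simple Prometheus text-exposition lines into {metric: value}."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        try:
            out[name] = float(value)
        except ValueError:
            pass  # skip lines that are not "name value"
    return out

# Hypothetical sample of what /metrics might serve:
sample = """
# HELP godwit_objects_total Objects processed
godwit_objects_total 42
godwit_bytes_transferred 1048576
"""
```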
Prometheus Metrics →