Step-by-Step Guides

Practical tutorials for common migration scenarios and workflows.

Local Filesystem to S3

Upload a local directory to an S3-compatible bucket. Perfect for initial data seeding or backup workflows.

CLI command:

# Upload local data to S3
godwit sync \
  --source ./testdata/ \
  --destination s3://my-bucket/backup \
  --destination-access-key access_key \
  --destination-secret-key secret_key \
  --destination-endpoint localhost:9001 \
  --destination-secure=false \
  --state-path ./tmp/state.db \
  --logs-dir ./tmp/logs \
  --ui

Config file equivalent:

# local-to-s3.yml
source:
  url: ./testdata/

destination:
  url: s3://my-bucket/backup
  access_key: access_key
  secret_key: secret_key
  endpoint: localhost:9001
  secure: false

run:
  state_path: ./tmp/state.db

output:
  logs_dir: ./tmp/logs
  ui: true

S3 to Local Filesystem

Download objects from an S3 bucket to your local filesystem. Useful for creating local backups or restoring data.

CLI command:

# Pull bucket contents to local disk
godwit sync \
  --source s3://my-bucket/data \
  --source-access-key access_key \
  --source-secret-key secret_key \
  --source-endpoint localhost:9001 \
  --source-secure=false \
  --destination ./downloads/ \
  --state-path ./tmp/state.db \
  --logs-dir ./tmp/logs \
  --ui

Config file equivalent:

# s3-to-local.yml
source:
  url: s3://my-bucket/data
  access_key: access_key
  secret_key: secret_key
  endpoint: localhost:9001
  secure: false

destination:
  url: ./downloads/

run:
  state_path: ./tmp/state.db

output:
  logs_dir: ./tmp/logs
  ui: true

S3 to S3 Migration

Copy data between S3-compatible endpoints. Transfers happen entirely on the fly, with no local storage: data streams directly from source to destination.

CLI command:

# Copy between S3-compatible endpoints
godwit sync \
  --source s3://source-bucket/data \
  --source-endpoint source.storage.example.com \
  --source-region us-east-1 \
  --source-access-key SOURCE_ACCESS_KEY \
  --source-secret-key SOURCE_SECRET_KEY \
  --source-secure=true \
  --destination s3://dest-bucket/backup \
  --destination-endpoint localhost:9000 \
  --destination-access-key access_key \
  --destination-secret-key secret_key \
  --destination-secure=false \
  --state-path ./tmp/state.db \
  --logs-dir ./tmp/logs

Config file equivalent:

# s3-to-s3.yml
source:
  url: s3://source-bucket/data
  endpoint: source.storage.example.com
  region: us-east-1
  access_key: SOURCE_ACCESS_KEY
  secret_key: SOURCE_SECRET_KEY
  secure: true

destination:
  url: s3://dest-bucket/backup
  endpoint: localhost:9000
  access_key: access_key
  secret_key: secret_key
  secure: false

run:
  state_path: ./tmp/state.db

output:
  logs_dir: ./tmp/logs
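The on-the-fly behavior can be pictured as a chunked copy loop: each chunk is read from the source and written to the destination without ever touching local disk. This is an illustrative sketch (not Godwit's implementation), using in-memory streams as stand-ins for the source object and the destination upload:

```python
import io

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per chunk; nothing is written to local disk

def stream_copy(reader, writer, chunk_size=CHUNK_SIZE):
    """Copy a readable stream to a writable one, one chunk at a time."""
    copied = 0
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            break
        writer.write(chunk)
        copied += len(chunk)
    return copied

# In-memory stand-ins for a source object read and a destination upload
source = io.BytesIO(b"example object payload")
dest = io.BytesIO()
print(stream_copy(source, dest))  # 22 bytes copied
```

Because only one chunk is buffered at a time, memory use stays bounded regardless of object size.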

Cross-Cluster S3 Transfer

Copy between two S3-compatible endpoints with filtering and worker limits:

CLI command:

# Copy between S3-compatible endpoints
# Skip checksum files and limit to 2 workers
godwit sync \
  --source s3://source-bucket \
  --source-endpoint localhost:9001 \
  --source-access-key access_key \
  --source-secret-key secret_key \
  --source-secure=false \
  --destination s3://dest-bucket/backup \
  --destination-endpoint localhost:9101 \
  --destination-access-key access_key \
  --destination-secret-key secret_key \
  --destination-secure=false \
  --skip .md5 \
  --parallel 2 \
  --ui

Config file equivalent:

# cross-cluster.yml
source:
  url: s3://source-bucket
  endpoint: localhost:9001
  access_key: access_key
  secret_key: secret_key
  secure: false

destination:
  url: s3://dest-bucket/backup
  endpoint: localhost:9101
  access_key: access_key
  secret_key: secret_key
  secure: false

policy:
  skip:
    - .md5

options:
  parallel: 2

output:
  ui: true

Restricted IAM Permissions

When the source IAM identity does not have s3:GetObjectTagging permission, use --skip-tags to skip reading object tags instead of failing:

CLI command:

# S3 source with restricted IAM
godwit sync \
  --source s3://my-bucket/data \
  --source-endpoint source.storage.example.com \
  --source-region us-east-1 \
  --source-access-key ACCESS_KEY \
  --source-secret-key SECRET_KEY \
  --source-secure=true \
  --skip-tags \
  --destination ./downloads \
  --state-path ./state.db

Config file equivalent:

# restricted-iam.yml
source:
  url: s3://my-bucket/data
  endpoint: source.storage.example.com
  region: us-east-1
  access_key: ACCESS_KEY
  secret_key: SECRET_KEY
  secure: true

destination:
  url: ./downloads

policy:
  skip_tags: true

run:
  state_path: ./state.db

Plan List & Inspect

Every sync run is tracked in the state database. Use plan commands to list past runs and inspect their detailed progress. All plan commands support --config (-f) so you can reuse the same YAML config file from your sync.

List All Runs

View every recorded run with status, object count, transferred bytes, duration, and failures. Point --state-path at your state database, or use -f to load it from a config file.

CLI command:

# List all sync runs
godwit plan list \
  --state-path ./tmp/state.db

# Or reuse the same config from your sync
godwit plan list -f migration.yml

Config file equivalent:

# migration.yml (shared config)
run:
  state_path: ./tmp/state.db
  # run_id: <run-id>  ← used by inspect/objects

# source/destination config...

Inspect a Run

Get a full summary for a specific run — total objects, pending, finished, failed counts and data transferred vs. remaining. Use -f to load run_id and state_path from a config file.

CLI command:

# Inspect a specific run
godwit plan inspect \
  --run-id <run-id> \
  --state-path ./tmp/state.db

# Or with config file
godwit plan inspect -f migration.yml

Config file equivalent:

# migration.yml
run:
  run_id: <run-id>
  state_path: ./tmp/state.db

List Plan Objects

List objects in a run filtered by status. Combine statuses with + (e.g. pending+running) and optionally filter by --storage-class. Use -f to load run_id and state_path from a config file.

CLI command:

# List all objects in a run
godwit plan list objects all \
  --run-id <run-id> \
  --state-path ./tmp/state.db

# Or with config file
godwit plan list objects all -f migration.yml

# List pending + running objects
godwit plan list objects pending+running \
  -f migration.yml

# Filter by storage class
godwit plan list objects all \
  --storage-class GLACIER \
  -f migration.yml

Config file equivalent:

# migration.yml
run:
  run_id: <run-id>
  state_path: ./tmp/state.db
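The `+`-combined status filter can be understood as a simple set membership test. A minimal sketch of that filtering logic (illustrative only; the object dicts and status names here are hypothetical, not Godwit's internal schema):

```python
def filter_by_status(objects, statuses):
    """Filter plan objects by a '+'-combined status string, e.g. 'pending+running'."""
    if statuses == "all":
        return list(objects)
    wanted = set(statuses.split("+"))
    return [o for o in objects if o["status"] in wanted]

objs = [
    {"key": "a", "status": "pending"},
    {"key": "b", "status": "finished"},
    {"key": "c", "status": "running"},
]
print([o["key"] for o in filter_by_status(objs, "pending+running")])  # ['a', 'c']
```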

Plan Verify

After a sync completes, verify that all transferred objects match their expected checksums. Godwit compares each object's MD5 against its .md5 sidecar file at the destination.
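The sidecar comparison amounts to hashing the object and matching the hex digest stored next to it. A minimal sketch of that check (assumptions: the sidecar holds a hex MD5 digest, optionally followed by a filename; this is not Godwit's actual code):

```python
import hashlib
import io

def md5_hex(stream, chunk_size=1 << 20):
    """Stream an object through MD5 without loading it fully into memory."""
    h = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

def matches_sidecar(object_stream, sidecar_text):
    """Compare an object's MD5 against the digest in its .md5 sidecar."""
    expected = sidecar_text.split()[0].strip().lower()
    return md5_hex(object_stream) == expected

payload = b"hello world"
sidecar = hashlib.md5(payload).hexdigest() + "\n"
print(matches_sidecar(io.BytesIO(payload), sidecar))  # True
```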

Verify a Completed Run

Run checksum verification for all completed objects in a sync run. Point --destination at the same endpoint used during sync.

CLI command:

# Verify a completed sync run
godwit plan verify \
  --run-id <run-id> \
  --destination s3://dest-bucket/backup \
  --destination-endpoint localhost:9000 \
  --destination-access-key access_key \
  --destination-secret-key secret_key \
  --destination-secure=false \
  --state-path ./tmp/state.db \
  --ui

Config file equivalent:

# verify.yml
run:
  run_id: <run-id>
  state_path: ./tmp/state.db

destination:
  url: s3://dest-bucket/backup
  endpoint: localhost:9000
  access_key: access_key
  secret_key: secret_key
  secure: false

output:
  ui: true

Resume Verification

If verification is interrupted, use --resume to skip already-verified objects and continue from where you left off.

CLI command:

# Resume an interrupted verification
godwit plan verify \
  --run-id <run-id> \
  --destination s3://dest-bucket/backup \
  --destination-endpoint localhost:9000 \
  --destination-access-key access_key \
  --destination-secret-key secret_key \
  --destination-secure=false \
  --state-path ./tmp/state.db \
  --resume --ui

Config file equivalent:

# verify-resume.yml
run:
  run_id: <run-id>
  state_path: ./tmp/state.db

destination:
  url: s3://dest-bucket/backup
  endpoint: localhost:9000
  access_key: access_key
  secret_key: secret_key
  secure: false

output:
  ui: true

# Pass --resume on the CLI:
# godwit plan verify -f verify-resume.yml --resume

Rate Limiting and Throttling

Control transfer speed to avoid overwhelming source or destination systems. Essential for production environments.

Request Rate Limit

Cap the number of API requests per second with --rps. Useful when the source or destination has strict request quotas.

CLI command:

# Limit to 50 requests per second
godwit sync \
  --source s3://source-bucket \
  --destination s3://dest-bucket \
  --rps 50 \
  ...

Config file equivalent:

# rps-limit.yml
rate_limit:
  rps: 50

# ... source/destination config

Bandwidth Limit

Cap the read throughput in bytes per second with --read-bps. Prevents saturating network links during business hours.

CLI command:

# Limit read bandwidth to 100MB/s
godwit sync \
  --source s3://source-bucket \
  --destination s3://dest-bucket \
  --read-bps 104857600 \
  ...

Config file equivalent:

# bandwidth-limit.yml
rate_limit:
  read_bps: 104857600  # 100MB/s

# ... source/destination config
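The raw byte values above come from a simple binary-unit conversion. A quick helper for computing them (assumes 1 MB = 1024 × 1024 bytes, matching the comments in these examples):

```python
def mb_per_s(mb):
    """Convert a megabytes-per-second target to the raw bytes/sec
    value that read_bps expects (1 MB = 1024 * 1024 bytes)."""
    return mb * 1024 * 1024

print(mb_per_s(100))  # 104857600, the value used above
print(mb_per_s(50))   # 52428800
```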

Concurrent Upload Limit

Cap the number of concurrent in-flight uploads with --max-inflight. Helps avoid overwhelming the destination when objects are large.

CLI command:

# Limit concurrent uploads
godwit sync \
  --source s3://source-bucket \
  --destination s3://dest-bucket \
  --max-inflight 10 \
  ...

Config file equivalent:

# inflight-limit.yml
rate_limit:
  max_inflight: 10

# ... source/destination config

Combined Limits

Combine multiple limits for fine-grained control. All limits are enforced simultaneously — the most restrictive one wins at any moment.

CLI command:

# Combine limits for fine-grained control
godwit sync \
  --source s3://production-bucket \
  --destination s3://backup-bucket \
  --rps 25 \
  --read-bps 52428800 \
  --parallel 2 \
  --max-inflight 5 \
  ...

Config file equivalent:

# production-limits.yml
rate_limit:
  rps: 25
  read_bps: 52428800   # 50MB/s
  max_inflight: 5

options:
  parallel: 2

# ... source/destination config
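A rough way to reason about "the most restrictive one wins": with a request-rate cap and a bandwidth cap active at once, the achievable throughput is bounded by whichever limit bites first for your average object size. A back-of-envelope sketch (an estimate under stated assumptions, not a model of Godwit's scheduler):

```python
def effective_bps(rps, read_bps, avg_object_bytes):
    """Upper bound on throughput when both a request-rate cap and a
    bandwidth cap are enforced simultaneously."""
    return min(rps * avg_object_bytes, read_bps)

# With --rps 25 and --read-bps 52428800 (50 MB/s):
print(effective_bps(25, 52428800, 1 * 1024 * 1024))  # 26214400: rps is the bottleneck
print(effective_bps(25, 52428800, 8 * 1024 * 1024))  # 52428800: bandwidth is the bottleneck
```

Small objects make the request-rate cap dominate; large objects make the bandwidth cap dominate.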

Version History Migration

Transfer all versions of every object, not just the latest. When versioned buckets contain objects in cold storage classes (GLACIER, DEEP_ARCHIVE, GLACIER_IR), Godwit automatically skips those versions and reports on completeness.

Transfer All Versions

Use --version-mode all to enumerate and transfer every version of every object from the source bucket. Each version is individually compared and transferred to the destination.

CLI command:

# Transfer all object versions from source to destination
godwit sync \
  --source s3://source-bucket/data \
  --destination s3://dest-bucket/data \
  --source-endpoint source.storage.example.com \
  --source-access-key SOURCE_KEY \
  --source-secret-key SOURCE_SECRET \
  --destination-endpoint dest.storage.example.com \
  --destination-access-key DEST_KEY \
  --destination-secret-key DEST_SECRET \
  --version-mode all \
  --state-path ./tmp/state.db \
  --logs-dir ./tmp/logs \
  --brief

Config file equivalent:

# version-history.yml
source:
  url: s3://source-bucket/data
  endpoint: source.storage.example.com
  access_key: SOURCE_KEY
  secret_key: SOURCE_SECRET

destination:
  url: s3://dest-bucket/data
  endpoint: dest.storage.example.com
  access_key: DEST_KEY
  secret_key: DEST_SECRET

versioning:
  mode: all

run:
  state_path: ./tmp/state.db

output:
  logs_dir: ./tmp/logs
  brief: true

Transfer Versions Since a Date

Use --version-mode "since:<RFC3339>" to transfer only versions created after a specific timestamp. This is useful for incremental version backups where you only need recent changes.

CLI command:

# Transfer only versions created after a specific date
godwit sync \
  --source s3://source-bucket/data \
  --destination s3://dest-bucket/data \
  --version-mode "since:2025-01-01T00:00:00Z" \
  --state-path ./tmp/state.db \
  ...

Config file equivalent:

# version-since.yml
versioning:
  mode: "since:2025-01-01T00:00:00Z"

# ... source/destination config
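The `since:` cutoff is an RFC 3339 timestamp compared against each version's creation time. A sketch of that filtering (illustrative only; whether the cutoff is strictly-after or at-or-after is an assumption here, and the version dicts are hypothetical):

```python
from datetime import datetime, timezone

def parse_since(mode):
    """Parse a 'since:<RFC3339>' version-mode string into a datetime."""
    assert mode.startswith("since:")
    return datetime.fromisoformat(mode[len("since:"):].replace("Z", "+00:00"))

def versions_since(versions, mode):
    """Keep only versions created after the cutoff."""
    cutoff = parse_since(mode)
    return [v for v in versions if v["last_modified"] > cutoff]

versions = [
    {"id": "v1", "last_modified": datetime(2024, 6, 1, tzinfo=timezone.utc)},
    {"id": "v2", "last_modified": datetime(2025, 3, 1, tzinfo=timezone.utc)},
]
print([v["id"] for v in versions_since(versions, "since:2025-01-01T00:00:00Z")])  # ['v2']
```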

Glacier and Cold Storage Handling

When --version-mode all encounters objects in GLACIER, DEEP_ARCHIVE, or GLACIER_IR storage classes, those versions are automatically skipped (they require a restore before they can be read). Godwit warns about glacier objects during planning and reports partial version history after completion.

# Sync versioned bucket with mixed storage classes
# Glacier/Deep Archive versions are automatically skipped
godwit sync \
  --source s3://source-bucket/data \
  --destination s3://dest-bucket/data \
  --version-mode all \
  --state-path ./tmp/state.db \
  --brief \
  ...

# Example output:
# Planning...
# ⚠ Warning: 9 GLACIER objects detected.
#   Restore required before migration.
# Uploading...
# ⚠ 3 keys have partial version history
#   (some versions skipped due to Glacier storage class)
# Version History: 4 complete, 3 partial, 1 fully skipped

Version History Outcomes

After a versioned sync, each key is classified into one of three outcomes based on how its versions were handled:

Complete History

All versions of this key were transferred successfully. No versions were in cold storage.

Partial History ⚠

Some versions were transferred, but others were skipped due to Glacier/Deep Archive storage class. The key's history is incomplete at the destination.

Fully Skipped

All versions of this key are in cold storage. Nothing was transferred for this key.
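The three outcomes above follow directly from counting how many of a key's versions sit in cold storage. A minimal sketch of that classification (illustrative, not Godwit's code; the return labels are hypothetical names):

```python
COLD = {"GLACIER", "DEEP_ARCHIVE", "GLACIER_IR"}

def classify_key(version_storage_classes):
    """Classify a key's version history by how many versions were cold-skipped."""
    skipped = sum(1 for sc in version_storage_classes if sc in COLD)
    if skipped == 0:
        return "complete"
    if skipped == len(version_storage_classes):
        return "fully_skipped"
    return "partial"

print(classify_key(["STANDARD", "STANDARD"]))     # complete
print(classify_key(["STANDARD", "GLACIER"]))      # partial
print(classify_key(["GLACIER", "DEEP_ARCHIVE"]))  # fully_skipped
```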

Inspecting Version History

After a versioned sync, use plan inspect to see version history completeness, and plan list objects with --partial-history to identify keys with incomplete history.

# Inspect version history completeness for a run
godwit plan inspect --run-id <run-id> --state-path ./tmp/state.db

# Example output:
# Version History:
#   Complete History:      4 keys
#   Partial History:       3 keys    ⚠
#   Fully Skipped:         1 keys
#
# Storage classes detected:
#   STANDARD:             60.9%   14 objects   144 B
#   GLACIER:              39.1%   9 objects   143 B

# List keys with partial version history
godwit plan list objects all --partial-history \
  --run-id <run-id> --state-path ./tmp/state.db

# List all glacier-skipped objects
godwit plan list objects glacier \
  --run-id <run-id> --state-path ./tmp/state.db

Storage Class Behavior

How each S3 storage class is handled during versioned transfers:

STANDARD

Transferred normally. All versions are eligible for copy.

STANDARD_IA

Transferred normally. Same as STANDARD but with different S3 pricing.

GLACIER

Skipped automatically. Objects must be restored to STANDARD before transfer. Versions are marked with glacier status.

DEEP_ARCHIVE

Skipped automatically. Longest restore time (up to 48 hours). Same skip behavior as GLACIER.

GLACIER_IR

Skipped automatically. Despite faster retrieval than GLACIER, a restore operation is still required before transfer.

Object Lock Preservation

Replicate Object Lock retention modes and legal hold settings from source to destination. When enabled, Godwit reads each object's lock configuration and applies the same retention and legal hold at the destination.

Enabling Object Lock

Add --object-lock to your sync command. The destination bucket must have Object Lock enabled. Godwit Sync reads each object's retention policy and legal hold from the source and applies them when writing to the destination.

CLI command:

# Sync with Object Lock preservation
godwit sync \
  --source s3://source-bucket/data \
  --destination s3://dest-bucket/data \
  --source-endpoint source.storage.example.com \
  --source-access-key SOURCE_KEY \
  --source-secret-key SOURCE_SECRET \
  --destination-endpoint dest.storage.example.com \
  --destination-access-key DEST_KEY \
  --destination-secret-key DEST_SECRET \
  --version-mode all \
  --object-lock \
  --state-path ./tmp/state.db \
  --brief

Config file equivalent:

# object-lock.yml
source:
  url: s3://source-bucket/data
  endpoint: source.storage.example.com
  access_key: SOURCE_KEY
  secret_key: SOURCE_SECRET

destination:
  url: s3://dest-bucket/data
  endpoint: dest.storage.example.com
  access_key: DEST_KEY
  secret_key: DEST_SECRET

versioning:
  mode: all

object_lock:
  enabled: true

run:
  state_path: ./tmp/state.db

output:
  brief: true

Retention Modes

Godwit Sync preserves each version's lock type. After a sync, every version is classified by its lock configuration:

GOVERNANCE

Retention with bypass. Privileged users can override the lock before the retain-until date.

COMPLIANCE

Strict retention. No user, including root, can delete or shorten the retention period.

Legal Hold

Indefinite hold independent of retention. Must be explicitly removed before the object can be deleted.

None

No Object Lock configuration on this version. Transferred without any lock settings.
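One way to picture the classification is as a lookup over a version's retention mode and legal-hold flag. A sketch of that bucketing (illustrative only; the precedence applied when a version carries both a retention mode and a legal hold is an assumption here):

```python
def lock_category(retention_mode, legal_hold):
    """Bucket a version by its Object Lock configuration.
    retention_mode: 'GOVERNANCE', 'COMPLIANCE', or None; legal_hold: bool.
    Assumption: a legal hold is reported ahead of any retention mode."""
    if legal_hold:
        return "legal_hold"
    if retention_mode in ("GOVERNANCE", "COMPLIANCE"):
        return retention_mode.lower()
    return "none"

print(lock_category("GOVERNANCE", False))  # governance
print(lock_category(None, True))           # legal_hold
print(lock_category(None, False))          # none
```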

Inspecting Object Lock Status

After a sync with --object-lock, use plan inspect to see a breakdown of lock types across all transferred versions.

# Inspect Object Lock statistics for a run
godwit plan inspect --run-id <run-id> --state-path ./tmp/state.db

# Example output:
# Object Lock:
#   Governance:     12 versions
#   Compliance:      4 versions
#   Legal Hold:      2 versions
#   None:           38 versions

Resume and Recovery

Safely interrupt and resume transfers. The state database tracks progress so you never lose work.

CLI command:

# Step 1: Plan the transfer
godwit sync \
  --source s3://large-bucket \
  --destination s3://backup-bucket \
  --state-path ./migration.db \
  --plan-only \
  ...

# Step 2: Start execution (can be interrupted with Ctrl+C)
godwit sync \
  --source s3://large-bucket \
  --destination s3://backup-bucket \
  --state-path ./migration.db \
  --resume \
  ...

# Step 3: After interruption, resume from where you left off
godwit sync \
  --source s3://large-bucket \
  --destination s3://backup-bucket \
  --state-path ./migration.db \
  --resume \
  ...

Config file equivalent:

# migration.yml
source:
  url: s3://large-bucket
  # access_key, secret_key, endpoint...

destination:
  url: s3://backup-bucket
  # access_key, secret_key, endpoint...

run:
  state_path: ./migration.db

# Step 1: Plan only
# godwit sync -f migration.yml --plan-only

# Step 2: Start execution
# godwit sync -f migration.yml --resume

# Step 3: Resume after interruption
# godwit sync -f migration.yml --resume

Prometheus Monitoring

Integrate Godwit Sync with your Prometheus monitoring stack.

CLI command:

# Run sync with metrics enabled
godwit sync \
  --source ./data \
  --destination s3://my-bucket/backup \
  --destination-endpoint localhost:9000 \
  --destination-access-key access_key \
  --destination-secret-key secret_key \
  --destination-secure=false \
  --status-addr :8080 \
  --drain-timeout 30

Config file equivalent:

# prometheus-sync.yml
source:
  url: ./data

destination:
  url: s3://my-bucket/backup
  endpoint: localhost:9000
  access_key: access_key
  secret_key: secret_key
  secure: false

status:
  addr: ":8080"
  drain_timeout: 30

Add a scrape job to your Prometheus configuration:

# prometheus.yml
scrape_configs:
  - job_name: godwit
    static_configs:
      - targets: ["localhost:8080"]

Available Endpoints

/metrics

Prometheus-format metrics including counters, histograms, and ETA gauge.

Prometheus Metrics →

/status

JSON summary of the current run with progress and statistics.

Status Endpoint →