Introduction
Running Spring Boot applications on AWS is straightforward when you have a single service and a relational database. The complexity emerges when you need to scale horizontally, handle bursty workloads, maintain database performance under load, and integrate multiple AWS services without your architecture becoming a tangled mess of IAM roles and VPC configurations.
This article covers the patterns I have settled on after running Spring Boot microservices on AWS for several years. These are not theoretical reference architectures; they are the actual configurations and patterns running in production today, refined through trial and error.
ECS Fargate: The Right Abstraction for Most Teams
I use ECS Fargate for the vast majority of Spring Boot services. Kubernetes is more flexible, but for teams that are primarily application engineers rather than platform engineers, ECS provides a dramatically simpler operational model. You get container orchestration, service discovery, auto-scaling, and load balancing without managing a control plane.
Here is a Terraform configuration that captures the patterns I use for a typical Spring Boot service:
resource "aws_ecs_task_definition" "translation_service" {
family = "translation-service"
requires_compatibilities = ["FARGATE"]
network_mode = "awsvpc"
cpu = 1024
memory = 2048
execution_role_arn = aws_iam_role.ecs_execution.arn
task_role_arn = aws_iam_role.translation_service_task.arn
container_definitions = jsonencode([{
name = "translation-service"
image = "${aws_ecr_repository.translation_service.repository_url}:${var.image_tag}"
portMappings = [{
containerPort = 8080
protocol = "tcp"
}]
environment = [
{ name = "SPRING_PROFILES_ACTIVE", value = var.environment },
{ name = "SERVER_PORT", value = "8080" },
{ name = "JAVA_OPTS", value = "-XX:+UseZGC -XX:MaxRAMPercentage=75.0 -XX:+UseStringDeduplication" }
]
secrets = [
{ name = "SPRING_DATASOURCE_URL", valueFrom = "${aws_secretsmanager_secret.db_url.arn}" },
{ name = "SPRING_DATASOURCE_PASSWORD", valueFrom = "${aws_secretsmanager_secret.db_password.arn}" }
]
healthCheck = {
command = ["CMD-SHELL", "curl -f http://localhost:8080/actuator/health || exit 1"]
interval = 30
timeout = 5
retries = 3
startPeriod = 120
}
logConfiguration = {
logDriver = "awslogs"
options = {
"awslogs-group" = "/ecs/translation-service"
"awslogs-region" = var.region
"awslogs-stream-prefix" = "ecs"
}
}
}])
}
A few critical details in this configuration:
-XX:MaxRAMPercentage=75.0 instead of a fixed -Xmx. Fargate provides a fixed memory allocation, and your JVM needs to share that with the operating system, native memory, and thread stacks. Setting MaxRAMPercentage to 75% leaves enough headroom for these non-heap allocations. I have seen services crash with OOM kills because someone set -Xmx to 100% of the Fargate memory allocation.
startPeriod on health checks. Spring Boot applications take time to start, especially with Hibernate DDL validation, Flyway migrations, and connection pool initialization. Without a start period, ECS will kill containers that are still starting up, creating a restart loop.
Secrets from Secrets Manager, not environment variables. Environment variables are visible in the ECS console and task definition. Database credentials and API keys should always come from Secrets Manager or Parameter Store.
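The heap headroom arithmetic behind MaxRAMPercentage is worth making explicit. A minimal sketch using the task definition's numbers (2048 MiB of task memory, 75%):

```java
// Sketch: what -XX:MaxRAMPercentage=75.0 means for a Fargate task.
// The remainder is left for metaspace, thread stacks, native memory, and the OS.
public class HeapBudget {

    // Maximum heap the JVM will size for a given container memory allocation.
    static long heapMib(long containerMib, double maxRamPercentage) {
        return (long) (containerMib * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        long heap = heapMib(2048, 75.0);
        long headroom = 2048 - heap;
        // 1536 MiB of heap, 512 MiB of non-heap headroom
        System.out.println("heap=" + heap + " MiB, headroom=" + headroom + " MiB");
    }
}
```

At 100% the same task would have zero headroom, which is exactly the OOM-kill scenario described above.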
For auto-scaling, I combine target tracking on CPU with step scaling on custom CloudWatch metrics such as queue depth. Here is the target-tracking half:
resource "aws_appautoscaling_policy" "cpu_scaling" {
name = "translation-service-cpu-scaling"
service_namespace = "ecs"
resource_id = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.translation.name}"
scalable_dimension = "ecs:service:DesiredCount"
policy_type = "TargetTrackingScaling"
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
target_value = 65.0
scale_in_cooldown = 300
scale_out_cooldown = 60
}
}
SQS for Asynchronous Decoupling
SQS is my default choice for asynchronous communication between services when I do not need Kafka’s replay capability, ordering guarantees, or stream processing features. SQS is simpler to operate, has no infrastructure to manage, and integrates naturally with Spring Boot through Spring Cloud AWS.
The pattern I use most is a request-response flow where the initial HTTP request validates and enqueues work, and a consumer processes it asynchronously:
@Service
@Slf4j
@RequiredArgsConstructor
public class TranslationJobProducer {

    private final SqsTemplate sqsTemplate;

    public String submitJob(TranslationJobRequest request) {
        String jobId = UUID.randomUUID().toString();
        TranslationJobMessage message = new TranslationJobMessage(
                jobId,
                request.sourceLanguage(),
                request.targetLanguage(),
                request.segments(),
                request.priority(),
                Instant.now()
        );
        sqsTemplate.send(to -> to
                .queue("translation-jobs-queue")
                .payload(message)
                .header("jobId", jobId)
                .header("priority", request.priority().name())
                .delaySeconds(0));
        log.info("Translation job {} submitted to queue", jobId);
        return jobId;
    }
}
@Component
@Slf4j
@RequiredArgsConstructor
public class TranslationJobConsumer {

    private final TranslationService translationService;
    private final JobStatusRepository statusRepository;

    @SqsListener(
        value = "translation-jobs-queue",
        maxConcurrentMessages = "10",
        maxMessagesPerPoll = "5",
        pollTimeoutSeconds = "20"
    )
    public void processJob(@Payload TranslationJobMessage message,
                           @Header("jobId") String jobId) {
        log.info("Processing translation job {}", jobId);
        statusRepository.updateStatus(jobId, JobStatus.PROCESSING);
        try {
            TranslationResult result = translationService.translate(
                    message.sourceLanguage(),
                    message.targetLanguage(),
                    message.segments()
            );
            // Store the result before flipping the status, so a client that
            // sees COMPLETED can always fetch the result.
            statusRepository.storeResult(jobId, result);
            statusRepository.updateStatus(jobId, JobStatus.COMPLETED);
            log.info("Translation job {} completed successfully", jobId);
        } catch (Exception e) {
            log.error("Translation job {} failed", jobId, e);
            statusRepository.updateStatus(jobId, JobStatus.FAILED);
            throw e; // let SQS handle retry via visibility timeout
        }
    }
}
Dead letter queues are essential. Configure a DLQ on every SQS queue and set a reasonable maxReceiveCount. I typically use 3 retries before messages move to the DLQ:
resource "aws_sqs_queue" "translation_jobs" {
name = "translation-jobs-queue"
visibility_timeout_seconds = 300
message_retention_seconds = 1209600 # 14 days
receive_wait_time_seconds = 20 # long polling
redrive_policy = jsonencode({
deadLetterTargetArn = aws_sqs_queue.translation_jobs_dlq.arn
maxReceiveCount = 3
})
}
resource "aws_sqs_queue" "translation_jobs_dlq" {
name = "translation-jobs-dlq"
message_retention_seconds = 1209600
}
RDS Performance Patterns
RDS PostgreSQL is my default database for Spring Boot services. The patterns that matter most are connection management, query optimization, and read scaling.
Connection pooling is critical. Every ECS task opens connections to the database, and with auto-scaling, the total connection count can spike quickly. I use HikariCP (Spring Boot’s default) with conservative settings and add PgBouncer as a connection pooler for services with high concurrency:
spring:
  datasource:
    hikari:
      maximum-pool-size: 10
      minimum-idle: 5
      connection-timeout: 5000
      idle-timeout: 300000
      max-lifetime: 1200000
      leak-detection-threshold: 30000
With 10 ECS tasks at 10 connections each, that is 100 database connections. RDS instances have connection limits based on instance size, and hitting that limit causes connection failures across all services. For larger deployments, I put RDS Proxy in front of the database:
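The connection arithmetic is worth sketching. The max_connections estimate below uses the default RDS PostgreSQL parameter formula LEAST(DBInstanceClassMemory / 9531392, 5000); treat it as an approximation and check your actual parameter group:

```java
// Sketch: application-side connection demand versus the RDS-side limit.
public class ConnectionBudget {

    // Worst case: every task holds its full Hikari pool.
    static long totalAppConnections(int tasks, int poolSizePerTask) {
        return (long) tasks * poolSizePerTask;
    }

    // Approximation of RDS PostgreSQL's default max_connections:
    // LEAST(DBInstanceClassMemory / 9531392, 5000).
    static long estimatedMaxConnections(long instanceMemoryBytes) {
        return Math.min(instanceMemoryBytes / 9_531_392L, 5_000L);
    }
}
```

For a 4 GiB instance the estimate lands around 450 connections, so ten tasks at ten connections each already consume roughly a quarter of the budget before any other service, migration job, or operator session connects. That is the math that pushes larger deployments toward RDS Proxy: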
resource "aws_db_proxy" "main" {
name = "translation-platform-proxy"
engine_family = "POSTGRESQL"
role_arn = aws_iam_role.rds_proxy.arn
vpc_subnet_ids = var.private_subnet_ids
require_tls = true
idle_client_timeout = 1800
auth {
auth_scheme = "SECRETS"
iam_auth = "REQUIRED"
secret_arn = aws_secretsmanager_secret.db_credentials.arn
}
}
For read scaling, I use Spring’s @Transactional(readOnly = true) routing to direct read queries to Aurora read replicas:
@Configuration
public class DataSourceConfig {

    @Bean
    public DataSource dataSource(
            @Qualifier("writerDataSource") DataSource writer,
            @Qualifier("readerDataSource") DataSource reader) {
        var routingDataSource = new AbstractRoutingDataSource() {
            @Override
            protected Object determineCurrentLookupKey() {
                return TransactionSynchronizationManager.isCurrentTransactionReadOnly()
                        ? "reader" : "writer";
            }
        };
        routingDataSource.setTargetDataSources(Map.of(
                "writer", writer,
                "reader", reader
        ));
        routingDataSource.setDefaultTargetDataSource(writer);
        // Not a managed bean itself, so initialize it explicitly.
        routingDataSource.afterPropertiesSet();
        // Wrap in a lazy proxy so the physical connection is not fetched until
        // the first statement, after the transaction's readOnly flag is set.
        return new LazyConnectionDataSourceProxy(routingDataSource);
    }
}
The LazyConnectionDataSourceProxy wrapper matters: without it, Spring acquires the physical connection when the transaction begins, before the read-only flag is visible to the routing lookup, and every query silently hits the writer. With it in place, @Transactional(readOnly = true) service methods are transparently routed to the read replica, distributing query load without any changes to your repository or service layer.
Lambda for Event-Driven Glue
I use Lambda sparingly and specifically: for event-driven integrations that do not warrant a full ECS service. Common patterns include processing S3 upload events, handling SQS messages that require minimal logic, and building scheduled tasks.
A pattern I use frequently is an S3 event trigger for file processing:
public class TranslationFileHandler implements RequestHandler<S3Event, String> {

    // Lambda instantiates the handler via its no-arg constructor, so the
    // clients are created here and reused across warm invocations.
    private final S3Client s3Client = S3Client.create();
    private final SqsClient sqsClient = SqsClient.create();
    // JavaTimeModule is required to serialize the Instant in the message.
    private final ObjectMapper objectMapper = new ObjectMapper()
            .registerModule(new JavaTimeModule());

    @Override
    public String handleRequest(S3Event event, Context context) {
        for (S3EventNotification.S3EventNotificationRecord record : event.getRecords()) {
            String bucket = record.getS3().getBucket().getName();
            String key = record.getS3().getObject().getUrlDecodedKey();
            context.getLogger().log("Processing file: " + key);

            GetObjectRequest getRequest = GetObjectRequest.builder()
                    .bucket(bucket)
                    .key(key)
                    .build();

            try (InputStream stream = s3Client.getObject(getRequest)) {
                List<String> segments = parseTranslationFile(stream);
                TranslationJobMessage job = new TranslationJobMessage(
                        UUID.randomUUID().toString(),
                        extractSourceLang(key),
                        extractTargetLang(key),
                        segments,
                        Priority.NORMAL,
                        Instant.now()
                );
                sqsClient.sendMessage(SendMessageRequest.builder()
                        .queueUrl(System.getenv("TRANSLATION_QUEUE_URL"))
                        .messageBody(objectMapper.writeValueAsString(job))
                        .build());
            } catch (Exception e) {
                context.getLogger().log("Error processing file: " + e.getMessage());
                throw new RuntimeException(e);
            }
        }
        return "Processed " + event.getRecords().size() + " files";
    }
}
Keep Lambda functions simple. If your Lambda needs a Spring context, database connection pool, or complex initialization, it probably should be an ECS service. Lambda’s cold start penalty is real, and Spring Boot’s startup time makes it worse. For the rare cases where I do use Spring Boot in Lambda, I use Spring Cloud Function with SnapStart to mitigate cold starts.
Key Takeaways
The patterns that work best for Spring Boot on AWS prioritize operational simplicity. ECS Fargate over self-managed Kubernetes. SQS over custom message brokers when you do not need Kafka’s features. RDS Proxy for connection management at scale. Lambda for lightweight event-driven glue, not for full application logic.
The most important architectural decision is where to draw the boundary between synchronous and asynchronous processing. Every operation that can tolerate even a few seconds of latency should be asynchronous via SQS. This makes your synchronous API layer fast and predictable, while the asynchronous processing layer handles the heavy work at its own pace with natural back-pressure through queue depth.