
3 posts tagged with "serverless"


Host Database Serverlessly On AWS Fargate - Just Like Your Applications

· 4 min read
earayu
WeScale Contributor

Running a database in the cloud often means paying for idle capacity and dealing with persistent-storage overhead. AWS ECS + Fargate changes that equation: you pay only for actual compute time and can integrate seamlessly with S3 for storage, effectively separating compute from data. This guide shows you how to host a MySQL-compatible database (WeSQL) using Fargate's pay-as-you-go model while storing your data durably on S3.

What We’ll Build

By following this guide, you’ll launch a MySQL-compatible database that:

  • Runs on AWS Fargate (no dedicated servers)
  • Uses S3 for permanent data storage
  • Starts/stops on demand, ensuring you only pay when it’s actually running

Here’s the architecture and resource dependencies:

images/ecs-dependencies.svg

Prerequisites

  1. An AWS account with appropriate permissions
  2. Basic understanding of AWS VPC, ECS, IAM
  3. A VPC with public subnets and security groups configured

Networking Requirements

  • A VPC with a public subnet
  • An Internet Gateway attached
  • A security group allowing:
    • Inbound: TCP port 3306 (MySQL)
    • Outbound: All traffic

Create S3 Bucket

Create an S3 bucket for database files—this is where WeSQL will store all data.

images/create_s3_bucket.png
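If you prefer the SDK over the console, a minimal boto3 sketch follows. The bucket name and region here are assumptions; match them to the `WESQL_OBJECTSTORE_*` values used later in the task definition.

```python
def bucket_params(name: str, region: str) -> dict:
    """Build create_bucket arguments; us-east-1 must omit the LocationConstraint."""
    params = {"Bucket": name}
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params

# Live call (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3", region_name="us-west-2").create_bucket(
#     **bucket_params("wesql-fargate-test", "us-west-2"))
```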

Create IAM Roles

Next, create an IAM role to grant ECS tasks permission to access the S3 bucket. Attach this role to your task definition.

images/create_iam_role.png
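The same role can be created programmatically. The sketch below shows the standard trust policy for ECS tasks plus an illustrative S3 policy; the role name, policy name, and the exact action list are assumptions you should tighten to your needs.

```python
import json

# Trust policy letting ECS tasks assume the role (standard ECS principal)
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ecs-tasks.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

def s3_access_policy(bucket: str) -> dict:
    """Grant read/write on the WeSQL data bucket and its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        }],
    }

# Live calls (requires boto3 and IAM permissions; names are illustrative):
# import boto3
# iam = boto3.client("iam")
# iam.create_role(RoleName="wesql-task-role",
#                 AssumeRolePolicyDocument=json.dumps(TRUST_POLICY))
# iam.put_role_policy(RoleName="wesql-task-role", PolicyName="wesql-s3-access",
#                     PolicyDocument=json.dumps(s3_access_policy("wesql-fargate-test")))
```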

Create Log Group

Set up a CloudWatch Log Group to capture container logs:

images/log_group.png

Create ECS Cluster

Create an ECS cluster to organize Fargate tasks:

images/create_cluster.png
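The log group and cluster from the two steps above can also be created with boto3. The cluster name is an assumption; the log-group helper just mirrors the `/ecs/<family>` convention used in the task definition.

```python
def log_group_name(task_family: str) -> str:
    """ECS console convention for awslogs: /ecs/<task family>."""
    return f"/ecs/{task_family}"

# Live calls (requires boto3 and AWS credentials):
# import boto3
# boto3.client("logs", region_name="us-west-2").create_log_group(
#     logGroupName=log_group_name("wesql-fargate-task-def"))
# boto3.client("ecs", region_name="us-west-2").create_cluster(clusterName="wesql-cluster")
```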

Create ECS Task Definition

Define how your WeSQL container runs in the ECS Task Definition:

images/create_task_definition.png

Below is an example JSON for WeSQL:

{
  "family": "wesql-fargate-task-def",
  "taskRoleArn": "<your-task-role-with-s3-permission>",
  "executionRoleArn": "<ecs-default-ecsTaskExecutionRole>",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "wesql-server",
      "image": "apecloud/wesql-server:8.0.35-0.1.0_beta2.37",
      "cpu": 1024,
      "memory": 2048,
      "portMappings": [
        {
          "containerPort": 3306,
          "hostPort": 3306,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "environment": [
        {
          "name": "MYSQL_CUSTOM_CONFIG",
          "value": "[mysqld]\nport=3306\nlog-bin=binlog\ngtid_mode=ON\nenforce_gtid_consistency=ON\nlog_slave_updates=ON\nbinlog_format=ROW\nobjectstore_provider=aws\nrepo_objectstore_id=tutorial\nbranch_objectstore_id=main\nsmartengine_persistent_cache_size=1G"
        },
        {
          "name": "WESQL_OBJECTSTORE_ACCESS_KEY",
          "value": "<REPLACE_ME>"
        },
        {
          "name": "WESQL_DATA_DIR",
          "value": "/data/mysql/data"
        },
        {
          "name": "WESQL_OBJECTSTORE_SECRET_KEY",
          "value": "<REPLACE_ME>"
        },
        {
          "name": "MYSQL_ROOT_PASSWORD",
          "value": "passwd"
        },
        {
          "name": "WESQL_CLUSTER_MEMBER",
          "value": "127.0.0.1:13306"
        },
        {
          "name": "WESQL_OBJECTSTORE_REGION",
          "value": "us-west-2"
        },
        {
          "name": "WESQL_LOG_DIR",
          "value": "/data/mysql/log"
        },
        {
          "name": "WESQL_OBJECTSTORE_BUCKET",
          "value": "wesql-fargate-test"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/wesql-fargate-task-def",
          "awslogs-region": "us-west-2",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
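Once the JSON is saved to disk (the filename `task-def.json` below is an arbitrary choice), it can be registered via boto3 instead of the console:

```python
import json

def load_task_definition(path: str) -> dict:
    """Read the task definition JSON from disk so it can be passed to ECS."""
    with open(path) as f:
        return json.load(f)

# Live call (requires boto3 and AWS credentials):
# import boto3
# boto3.client("ecs", region_name="us-west-2").register_task_definition(
#     **load_task_definition("task-def.json"))
```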

Create ECS Service

Now create an ECS Service to keep the task running:

images/create_service2.png

Select the right subnets and security group:

images/create_service3.png

Check logs in CloudWatch to confirm the database is running:

images/create_service5.png
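The console steps above map to a single `create_service` call. This sketch shows the awsvpc network configuration Fargate requires; the cluster, service, subnet, and security-group names are placeholders you must replace with your own.

```python
def service_network_config(subnets, security_groups) -> dict:
    """awsvpc networking for Fargate; a public IP is needed to reach the DB from the internet."""
    return {
        "awsvpcConfiguration": {
            "subnets": list(subnets),
            "securityGroups": list(security_groups),
            "assignPublicIp": "ENABLED",
        }
    }

# Live call (requires boto3 and AWS credentials; names are placeholders):
# import boto3
# boto3.client("ecs", region_name="us-west-2").create_service(
#     cluster="wesql-cluster",
#     serviceName="wesql-service",
#     taskDefinition="wesql-fargate-task-def",
#     desiredCount=1,
#     launchType="FARGATE",
#     networkConfiguration=service_network_config(["subnet-xxxx"], ["sg-xxxx"]),
# )
```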

Connect to the Database

With the service active, you can connect to WeSQL using any MySQL client:

mysql -h <FARGATE_PUBLIC_IP> -P 3306 -uroot -ppasswd

images/connect_to_wesql.png

Pause and Resume the Database

WeSQL uses S3 for storage, so the database container in Fargate is stateless and can be stopped and restarted without losing data. This is ideal for saving costs—simply stop the database when not in use:

# Pause the database (set desired count to 0)
aws ecs update-service \
  --cluster <your-cluster-name> \
  --service <your-service-name> \
  --desired-count 0

# Resume the database (set desired count to 1)
aws ecs update-service \
  --cluster <your-cluster-name> \
  --service <your-service-name> \
  --desired-count 1

Or automate this with AWS Lambda:

import boto3

ecs = boto3.client('ecs')

def lambda_handler(event, context):
    action = event.get('action')
    cluster = event.get('cluster')
    service = event.get('service')
    desired_count = 0 if action == 'pause' else 1

    ecs.update_service(
        cluster=cluster,
        service=service,
        desiredCount=desired_count
    )

Other Considerations

  • Service Discovery: Use AWS Service Discovery for a stable DNS endpoint.
  • Secrets Management: Store credentials in AWS Secrets Manager.
  • Scheduled Start/Stop: Use AWS EventBridge to automate cost-saving start/stop schedules.
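For the scheduled start/stop idea, an EventBridge rule can invoke the Lambda above on a cron schedule. This is a hedged sketch: the rule name, Lambda ARN, and cron expression are assumptions, and you would also need to grant EventBridge permission to invoke the function.

```python
import json

def schedule_target(lambda_arn: str, action: str, cluster: str, service: str) -> dict:
    """EventBridge target that passes a pause/resume payload to the Lambda."""
    return {
        "Id": f"wesql-{action}",
        "Arn": lambda_arn,
        "Input": json.dumps({"action": action, "cluster": cluster, "service": service}),
    }

# Live calls (requires boto3; the cron expression is UTC and illustrative):
# import boto3
# events = boto3.client("events", region_name="us-west-2")
# events.put_rule(Name="wesql-pause-nightly", ScheduleExpression="cron(0 2 * * ? *)")
# events.put_targets(Rule="wesql-pause-nightly",
#                    Targets=[schedule_target("<lambda-arn>", "pause", "<cluster>", "<service>")])
```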

Automation with Python Scripts

Check out this simple Python script that uses the AWS SDK (boto3) to automate resource creation. You can also adapt it, or use Terraform, CloudFormation, etc.

By combining Fargate’s pay-as-you-go model with S3-based storage, you get a flexible “serverless” MySQL experience—no idle compute charges, no bulky EBS volumes, just on-demand database hosting that fits your actual usage.

WeSQL Outperforms AWS RDS MySQL Single AZ -- 4-6x the Speed, 1/2 the cost

· 7 min read
Dongsheng Zhao
SmartEngine: The Next-Generation Cloud-Native Storage Engine

When people hear "S3" and "OLTP database" in the same sentence, skepticism is a common reaction. S3 is known for its low cost, but its performance and latency are often seen as unsuitable for the demands of OLTP workloads.

At WeSQL, we’ve previously explained our approach in articles like Why S3 and Persistent Cache. We use S3 for its low cost, reliability, and scalability as durable storage, while addressing performance and latency challenges with efficient write and read caching mechanisms.

Still, questions remain: Can an S3-based OLTP database perform well in practical use cases?

In this blog, we’ll show how WeSQL achieves significant storage cost savings while delivering several times the computational efficiency of AWS RDS. By combining low cost with strong performance, WeSQL challenges the traditional expectations of OLTP databases with S3-based storage.

Test Explanations

Sysbench is a widely used tool for testing OLTP performance, but its final metrics can vary greatly depending on factors like instance specifications, data volume, concurrency levels, and access patterns.

In the following Sysbench tests, we have designed the testing scenarios to closely resemble real-world business use scenarios.

Instance size

We chose the 4-core 16GB specification for two reasons. First, it is widely used for small to medium-sized database instances in production, making the test results relevant to typical workloads. Second, it provides a well-balanced combination of CPU and memory, suitable for handling most OLTP workloads efficiently without over-provisioning resources.

Data volume and Random type

In typical online database environments, the total data scale usually ranges from 200GB to 300GB, but only a portion of this data is actively accessed. Following the "80/20 rule," the actively accessed data typically amounts to 40GB to 60GB. To simulate real-world business scenarios, we chose a test data volume of 48GB (100 tables with 2 million rows each), which falls within this active range. The data is accessed using a uniform pattern so that all parts of the dataset are evenly exercised. This setup creates a realistic environment for performance evaluation.

With 16GB of memory available, the 48GB data volume exceeds the memory capacity by a large margin. This forces the system to rely on disk-based operations, effectively testing the storage engine’s performance in areas such as I/O efficiency and caching strategies.

Test Environment

  • Compute Instances:

    • AWS RDS Single-AZ:
      • Instance type: db.m5d.xlarge (4vCPU, 16GB RAM)
      • Equipped with a 150GB local NVMe SSD for temporary storage. Persistent storage relies on EBS volumes.
    • WeSQL EC2:
      • Instance type: m5d.xlarge (4vCPU, 16GB RAM)
      • Also equipped with a 150GB local NVMe SSD, which WeSQL uses for caching to optimize read & update performance.
  • Storage Backend:

    • AWS RDS:
      • EBS gp3 volumes (200GB, 125MB/s, 3000 IOPS) for persistent storage.
    • WeSQL:
      • EBS gp3 volumes (100GB, 125MB/s, 3000 IOPS) for logs and metadata.
      • Primary data storage is offloaded to S3.

WeSQL is designed to minimize dependency on expensive EBS storage by leveraging S3 for data storage, so it uses a small EBS volume to store logs and metadata.

  • Client:

    • Sysbench 1.0.20
    • EC2: t3a.2xlarge (8vCPU, 32GB RAM)
  • Server:

    • Database Version:
      • AWS RDS: 8.0.35
      • WeSQL: Built on MySQL 8.0.35
    • Deployment:
      • Both systems were deployed as single-node instances for a direct performance comparison.
  • Network Configuration:

    • Availability Zone: All components—including AWS RDS, WeSQL EC2 instances, and the Sysbench client—were deployed within the same AWS availability zone to reduce network latency and ensure consistent test conditions.

Test Method

We used the Sysbench oltp_read_write workload to evaluate performance. The test configuration was as follows:

  • DataSet: Prepared 100 tables, each containing 2 million rows of data.
  • Concurrency: Tests were conducted with concurrency levels of 2, 4, 8, 16, 32, 64, 96, and 128.
  • Duration: Each concurrency level ran for 300 seconds.
  • Interval: A 60-second interval was applied before starting the next concurrency level.
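The test loop described above can be sketched as follows. The flags are standard sysbench 1.0 `oltp_read_write` options matching the stated dataset and access pattern; the host and password are placeholders.

```python
def sysbench_cmd(threads: int, host: str = "<db-host>") -> list:
    """One oltp_read_write run at the given concurrency, matching the setup above."""
    return [
        "sysbench", "oltp_read_write",
        f"--mysql-host={host}", "--mysql-user=root", "--mysql-password=<passwd>",
        "--tables=100", "--table-size=2000000",
        f"--threads={threads}", "--time=300", "--rand-type=uniform",
        "run",
    ]

# Print the command for each concurrency level (a 60s pause sits between runs)
for t in [2, 4, 8, 16, 32, 64, 96, 128]:
    print(" ".join(sysbench_cmd(t)))
```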

Results

We tested both AWS RDS and WeSQL under the oltp_read_write workload using varying levels of concurrency.

Throughput Comparison

images/wesql_rds_throughput.png

95th Percentile Latency Comparison

images/wesql_rds_rt.png

Conclusions

Performance perspective

Based on the test results, WeSQL demonstrates peak performance that is nearly 4 times higher than AWS RDS using the same resources. Additionally, WeSQL provides significantly better latency compared to AWS RDS.

In real-world business scenarios, low latency is often critical. For instance, under a 32-thread load, WeSQL achieves a QPS of 7356.72, with a P95 latency of 110.66ms. In contrast, at the same latency level, AWS RDS achieves only 1232.69 QPS. This means that WeSQL has approximately 6 times the throughput of AWS RDS when comparing performance at equivalent latency thresholds.

Cost perspective

WeSQL also provides a significant storage cost advantage. In this test scenario, where the overall data volume is relatively small, our costs are still nearly half of AWS RDS Single-AZ. As data volume grows, this cost advantage becomes even more pronounced.

  • AWS RDS Single-AZ (db.m5d.xlarge): USD 0.419 per hour
  • AWS RDS Multi-AZ (db.m5d.xlarge): USD 0.838 per hour
  • AWS EC2 (m5d.xlarge): USD 0.226 per hour
  • Above prices are based on the us-east-1 region.

Although we used a single-node deployment in our test, real-world environments typically require cross-AZ disaster recovery for resilience and fault tolerance. In WeSQL’s architecture, data durability is ensured by continuously uploading data to S3, which inherently provides cross-AZ disaster recovery capabilities. As a result, a single-data-node WeSQL deployment offers cross-AZ disaster recovery capabilities, while costing nearly 1/4 of AWS RDS Multi-AZ.

To ensure no data is lost during an AZ failure, including logs stored on EBS, WeSQL’s multi-AZ deployment adds two additional log nodes. In upcoming articles, we will provide a detailed analysis of the cost and performance differences between WeSQL and AWS RDS Multi-AZ.

Analysis

By separating the read and write QPS from the above tests for comparison, it is clear that WeSQL delivers superior performance over AWS RDS in both read and write operations.

images/qps_write.png

images/qps_read.png

Why WeSQL outperforms AWS RDS in write performance

  1. SmartEngine's Write-Optimized LSM-Tree Data Structure

    The storage engine used by WeSQL, SmartEngine, is built on an LSM-Tree architecture. This design minimizes write amplification compared to the B+ Tree structure used by InnoDB in AWS RDS, resulting in more efficient write operations and better overall write performance.

  2. S3's High Write Bandwidth Beats EBS GP3

    SmartEngine uses S3 as the persistent storage layer, taking advantage of its higher bandwidth to accelerate flush and compaction operations. In contrast, AWS RDS relies on gp3 volumes for persistent storage, where the limited bandwidth of gp3 can become a bottleneck during dirty page flushing. This leads to I/O constraints that hinder write performance in RDS.

Why WeSQL outperforms AWS RDS in read performance

  1. Low-Latency Local NVMe SSDs Cache

    SmartEngine makes use of local NVMe SSDs as a read cache, which provides several key advantages:

    • Separation of Read and Write I/O: By isolating reads from writes, WeSQL reduces I/O contention, resulting in smoother and faster read operations.
    • Higher Performance of NVMe SSDs: Local NVMe SSDs offer significantly better performance compared to the gp3 volumes used by AWS RDS, enabling faster data access and lower read latencies.
  2. Optimizations for LSM-Tree’s Read Challenges

    While the LSM-Tree architecture traditionally underperforms B+ Tree in read-heavy workloads, SmartEngine incorporates a series of optimizations to bridge this gap and achieve read performance comparable to InnoDB. These include:

    • Multi-priority caching mechanisms to prioritize hot data.
    • Bloom filters to minimize unnecessary disk reads.
    • Asynchronous I/O for better concurrency and throughput.
    • Latch-free metadata operations for lower contention and higher throughput.

Run Database in GitHub Actions, Persisting Data to S3, and Access it Publicly

· 7 min read
earayu
WeScale Contributor

When it comes to running a database for development, testing, demos, or short-lived workloads, cost and complexity can be big headaches. Traditional database hosting—like a VPS, a cloud VM, or a managed service—comes with ongoing costs, storage expenses, and configuration overhead. You often end up paying for resources even when you're not using them.

But what if you could spin up a database only when you need it, leverage cheap (or free) object storage for persistence, and tear it down afterward at nearly zero cost, without losing data? One possible approach is to use GitHub Actions as an ephemeral compute environment, combined with S3 (or an S3-compatible service) as persistent storage. With a secure tunnel, you can even access it publicly.

Important Note on Usage: This approach should be used only for short-term integration tests, temporary demos, or quick development tasks. Do not abuse GitHub Actions by running a database continuously or using it as a long-term service platform. GitHub Actions is designed primarily for CI/CD, not as a free computing resource for persistent services. If you need continuous or long-running database hosting, please consider other services or set up a Self-Hosted GitHub Runner in an environment you control, ensuring compliance with GitHub’s usage policies.

Key Idea

The core concept is:

  1. Ephemeral Compute from GitHub Actions: Spin up a MySQL-compatible database only on-demand as part of a CI/CD or testing workflow.
  2. S3-Compatible Storage for Persistence: Store all database data in object storage like AWS S3 or Cloudflare R2. When the ephemeral environment ends, you still have your data safely stored off-runner.
  3. Tunneling for Public Access: Temporarily expose the database to the internet for tests or demos.
  4. Short-Term Use Only: The database runs during the workflow execution window. Once done, the workflow ends and the ephemeral compute resource is freed. This is not intended as a permanent hosting solution.

Want it entirely free? Consider an S3-compatible service like Cloudflare R2 with a generous free tier.

Use Cases

  • Integration Testing in CI/CD: Spin up a real MySQL-compatible environment for test runs, then shut it down.
  • Temporary Demos: Need a quick, shareable DB instance for a one-off demo? Perfect.
  • Short-Term Development: Quickly test new code against a real DB environment without maintaining a full-time service.

Not Recommended For:

  • Long-term database hosting or production workloads.
  • Maintaining an always-on public database endpoint.
  • Circumventing GitHub Actions usage policies.

If you have longer-running needs, consider setting up a Self-Hosted Runner where you manage resources and adhere to the appropriate usage rules.

Example GitHub Actions Workflow

Below is an example GitHub Actions workflow that demonstrates this approach. The workflow briefly spins up a WeSQL database, uses object storage for persistence, and provides temporary tunnel access. If you want to use this workflow, you can follow the steps in the README.

name: Start WeSQL Cluster

on:
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Configure AWS CLI
        run: |
          aws configure set aws_access_key_id ${{ secrets.WESQL_OBJECTSTORE_ACCESS_KEY }}
          aws configure set aws_secret_access_key ${{ secrets.WESQL_OBJECTSTORE_SECRET_KEY }}
          aws configure set default.region ${{ secrets.WESQL_OBJECTSTORE_REGION }}

      - name: Start WeSQL Server
        run: |
          export WESQL_OBJECTSTORE_BUCKET=${{ secrets.WESQL_OBJECTSTORE_BUCKET }}
          export WESQL_OBJECTSTORE_REGION=${{ secrets.WESQL_OBJECTSTORE_REGION }}
          export WESQL_OBJECTSTORE_ACCESS_KEY=${{ secrets.WESQL_OBJECTSTORE_ACCESS_KEY }}
          export WESQL_OBJECTSTORE_SECRET_KEY=${{ secrets.WESQL_OBJECTSTORE_SECRET_KEY }}

          docker run -itd --network host --name wesql-server \
            -p 3306:3306 \
            -e MYSQL_CUSTOM_CONFIG="[mysqld]\n\
          port=3306\n\
          log-bin=binlog\n\
          gtid_mode=ON\n\
          enforce_gtid_consistency=ON\n\
          log_slave_updates=ON\n\
          binlog_format=ROW\n\
          objectstore_provider='aws'\n\
          repo_objectstore_id='tutorial'\n\
          objectstore_bucket='${WESQL_OBJECTSTORE_BUCKET}'\n\
          objectstore_region='${WESQL_OBJECTSTORE_REGION}'\n\
          branch_objectstore_id='main'" \
            -v ~/wesql-local-dir:/data/mysql \
            -e WESQL_CLUSTER_MEMBER='127.0.0.1:13306' \
            -e MYSQL_ROOT_PASSWORD=${{ secrets.WESQL_ROOT_PASSWORD }} \
            -e WESQL_OBJECTSTORE_ACCESS_KEY=${WESQL_OBJECTSTORE_ACCESS_KEY} \
            -e WESQL_OBJECTSTORE_SECRET_KEY=${WESQL_OBJECTSTORE_SECRET_KEY} \
            apecloud/wesql-server:8.0.35-0.1.0_beta4.38

      - name: Wait for MySQL port
        run: |
          for i in {1..60}; do
            if nc -z localhost 3306; then
              echo "MySQL port 3306 is ready!"
              exit 0
            fi
            echo "Waiting for MySQL port 3306..."
            sleep 5
          done
          echo "Timeout waiting for MySQL port 3306"
          exit 1

      - name: Start and parse Serveo tunnel
        run: |
          # Just a neat trick: start a tunnel and parse out the assigned port
          nohup ssh -o StrictHostKeyChecking=no -R 0:localhost:3306 serveo.net > serveo.log 2>&1 &
          sleep 5

          TUNNEL_LINE=$(grep 'Forwarding TCP' serveo.log || true)
          if [ -z "$TUNNEL_LINE" ]; then
            echo "No forwarding line found"
            exit 1
          fi

          HOST="serveo.net"
          PORT=$(echo "$TUNNEL_LINE" | sed 's/.*Forwarding TCP connect from .*:\([0-9]*\)/\1/')

          echo "MySQL Public Access:"
          echo "Host: $HOST"
          echo "Port: $PORT"
          echo "Connect: mysql -h $HOST -P $PORT -u root -p${{ secrets.WESQL_ROOT_PASSWORD }}"

          echo "HOST=$HOST" >> $GITHUB_ENV
          echo "PORT=$PORT" >> $GITHUB_ENV

      - name: Write Connection Info to S3
        run: |
          # Just a convenience: store connection info in S3 so you can find it later
          cat << EOF > connection_info.txt
          host=$HOST
          port=$PORT
          username=root
          password=${{ secrets.WESQL_ROOT_PASSWORD }}
          mysql_cli="mysql -h $HOST -P $PORT -u root -p${{ secrets.WESQL_ROOT_PASSWORD }}"
          EOF

          aws s3 cp connection_info.txt s3://${{ secrets.WESQL_OBJECTSTORE_BUCKET }}/connection_info.txt
          echo "Connection info is now in s3://${{ secrets.WESQL_OBJECTSTORE_BUCKET }}/connection_info.txt"

      - name: Keep session running
        run: |
          # Keep the workflow alive so the database stays accessible.
          echo "Press Ctrl+C or cancel the workflow when done."
          tail -f /dev/null

Breakdown of the Workflow (Tips & Tricks)

  • Using S3 or R2: By default, this example shows AWS S3. But since WeSQL just needs an S3-compatible API, you could easily switch to Cloudflare R2’s free tier. That could make the whole setup cost zero, no matter how often you run it.
  • Tunneling: Serveo is a neat trick. By using SSH, we ask Serveo to forward a random high port on their server to our MySQL port on the ephemeral runner. Suddenly, anyone on the internet can connect to our ephemeral DB—cool, right? You can replace Serveo with other tunneling services such as ngrok as you like, but Serveo is simple and free.
  • Persistence: Because all data lives in S3 (or R2), the DB can vanish and reappear as needed with full persistence. Think of it as "serverless" MySQL.
  • Storing Connection Info in S3: Another trick. Instead of always reading logs, you can fetch the connection details from S3. This is handy if you want to programmatically retrieve connection info for some automation elsewhere.
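The connection-info trick above can be consumed programmatically. A hedged sketch, assuming boto3 and your bucket name; the parser handles the `key=value` lines the workflow writes:

```python
def parse_connection_info(text: str) -> dict:
    """Parse the key=value lines written by the workflow's S3 upload step."""
    info = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        info[key] = value.strip('"')
    return info

# Live call (requires boto3 and the bucket from your repository secrets):
# import boto3
# body = boto3.client("s3").get_object(
#     Bucket="<your-bucket>", Key="connection_info.txt")["Body"].read().decode()
# print(parse_connection_info(body)["mysql_cli"])
```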

Connecting to the Database

After this workflow runs, check the Actions log for the Host and Port. On your local machine, just do:

mysql -h serveo.net -P <PORT> -u root -p<YOUR_PASSWORD>

images/use-github-actions-as-a-database/actions_run_page.png

Data Persistence & Restarting

The big win here: When the runner dies, you lose the container, but not the data. Next time you run the workflow, WeSQL pulls data back from your chosen object storage provider (AWS S3, R2, or any S3-compatible service). This means you can treat your DB like a long-lived environment—even though it’s technically ephemeral compute.

You can see all your data in the S3 bucket: images/use-github-actions-as-a-database/s3_page.png

A Note on Security

  • The tunnel is public. Use a strong password, maybe restrict access (if the service supports it), and rotate credentials when done.
  • Keep AWS and DB credentials in GitHub Secrets, not in your code.
  • If you handle sensitive data, consider TLS and additional security layers.

Conclusion

This approach challenges the traditional "rent a server and keep it running" model for databases. Using GitHub Actions as ephemeral compute plus S3 for storage, you get a "serverless-ish" database that runs only when you want it, at little to no cost, and is still accessible from anywhere on the net.

It’s a neat hack. Give it a try next time you need a throwaway or on-demand DB environment. You might never go back to your old ways of spinning up costly VMs or managed instances just to show someone a quick demo or run a batch of tests.