AWS RDS Aurora PostgreSQL Query Export to S3 Stub

February 2, 2024

AWS RDS and Aurora PostgreSQL have a very useful enhancement to the native PostgreSQL COPY command, aws_s3.query_export_to_s3, that lets you export query results from RDS/Aurora directly to AWS S3 storage. This is very useful for moving data around, including exporting from Production with masking functions applied so that sensitive data lands safely in test/debugging environments.
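
For example, a masked export might look like the following. The bucket, key, region, and users table are all illustrative; the call shape is the text-argument form of the function, which is the same shape the stub below implements:

SELECT *
FROM aws_s3.query_export_to_s3(
  $q$ SELECT id, md5(email) AS email_masked, created_at FROM users $q$,
  'example-export-bucket',       -- destination S3 bucket
  'exports/users_masked.csv',    -- object key within the bucket
  'us-east-1',                   -- bucket region
  'FORMAT csv, HEADER true'      -- COPY options
);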

There is currently one notable limitation around cross-account exporting: you can't do it. But simple workarounds exist: post-process the files to either re-key them or copy them across accounts yourself.

Your local machines won't have this AWS-specific function, but for local testing you can write a stub, installed into your local PostgreSQL database, that gives a good approximation of what the AWS function does.

Note: dollar-quoting is highly recommended for wrapping the select statement. It will make your life much easier if you are programmatically generating these statements and export commands.
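
For instance, dollar-quoting lets a query containing string literals pass through as written, with no doubled-up quote escaping (the table, bucket, and path here are illustrative):

-- Single-quoting forces every quote inside the query to be doubled:
--   'SELECT * FROM users WHERE status = ''active'''
-- Dollar-quoting passes the query through verbatim:
SELECT *
FROM aws_s3.query_export_to_s3(
  $q$ SELECT * FROM users WHERE status = 'active' $q$,
  'example-bucket', '/tmp/active_users.csv', 'us-east-1',
  'FORMAT csv, HEADER true'
);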

Update: See the related aws_s3.table_import_from_s3 stub.

CREATE SCHEMA IF NOT EXISTS aws_s3;
--
-- For local testing
-- stmt: The SELECT statement to be exported
-- bucket: S3 bucket name, ignored by the stub
-- path: Full absolute path to the desired local output file
-- region: S3 region, ignored by the stub
-- opt: COPY options, e.g. 'FORMAT csv, HEADER true'
-- Returns rows_uploaded, files_uploaded (always 1 when any rows are
-- exported), and bytes_uploaded (faked), named to match the columns
-- the AWS function returns
--
CREATE OR REPLACE
  FUNCTION aws_s3.query_export_to_s3(
     stmt TEXT, bucket TEXT, path TEXT,
     region TEXT, opt TEXT,
     OUT rows_uploaded BIGINT, OUT files_uploaded BIGINT,
     OUT bytes_uploaded BIGINT)
AS
$$
DECLARE copy_stmt TEXT;
BEGIN
  -- format() %L safely quotes the output path; %s splices the query
  -- and options in verbatim
  copy_stmt := format('COPY (%s) TO %L', stmt, path);
  IF opt IS NOT NULL AND opt <> '' THEN
    copy_stmt := copy_stmt || format(' WITH (%s)', opt);
  END IF;
  EXECUTE copy_stmt;
  GET DIAGNOSTICS rows_uploaded = ROW_COUNT;
  IF rows_uploaded > 0 THEN
    files_uploaded := 1;
    bytes_uploaded := rows_uploaded * 1024;  -- faked: ~1 KiB per row
  ELSE
    files_uploaded := 0;
    bytes_uploaded := 0;
  END IF;
END;
$$
LANGUAGE plpgsql;
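
A quick local smoke test of the stub (the /tmp path is just an example; plain COPY ... TO <file> writes on the database server, so the role running it needs superuser or membership in pg_write_server_files):

SELECT *
FROM aws_s3.query_export_to_s3(
  $q$ SELECT n AS id, 'user' || n AS name FROM generate_series(1, 5) AS n $q$,
  'any-bucket',              -- ignored by the stub
  '/tmp/users_export.csv',   -- local file the stub writes via COPY
  'us-east-1',               -- ignored by the stub
  'FORMAT csv, HEADER true'
);
--  rows_uploaded | files_uploaded | bytes_uploaded
-- ---------------+----------------+----------------
--              5 |              1 |           5120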