AWS RDS Aurora PostgreSQL Query Export to S3 Stub
February 2, 2024
AWS RDS and Aurora database services have a very useful enhancement to the native PostgreSQL COPY command, aws_s3.query_export_to_s3, which lets you export query results from RDS/Aurora directly to AWS S3 storage. This is very handy for moving data around, including exporting from production with masking functions applied so the data can land in test/debugging environments.
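A call on RDS/Aurora looks roughly like the sketch below. The bucket, path, table, and md5-based masking are all hypothetical; the argument shape mirrors the stub further down.

-- Hypothetical export: mask email addresses on the way out of production.
SELECT * FROM aws_s3.query_export_to_s3(
    'SELECT id, md5(email) AS email_masked FROM customers',
    'example-export-bucket',   -- destination S3 bucket (hypothetical)
    'exports/customers.csv',   -- key within the bucket
    'us-east-1',               -- bucket region
    'FORMAT csv, HEADER true'  -- COPY options
);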
There are currently some limitations around cross-account exporting, chiefly that you can’t. But simple workarounds exist: post-process the files to rekey them, or copy them across accounts yourself.
Your local machines won’t have this AWS-specific function, but for local testing you can write a stub and install it into your local PostgreSQL database to get a good approximation of what the AWS function does.
Note: dollar-quoting is highly recommended for wrapping the SELECT statement. It will make your life much easier if you are programmatically generating these statements and export commands, since embedded single quotes then need no escaping.
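For example (table and filter invented for illustration), the dollar-quoted query below carries its own single quotes without any doubling-up:

-- Dollar-quoting: the quotes around 'shipped' need no escaping.
SELECT * FROM aws_s3.query_export_to_s3(
    $sel$ SELECT order_id, total FROM orders WHERE status = 'shipped' $sel$,
    'example-export-bucket', 'exports/shipped-orders.csv', 'us-east-1',
    'FORMAT csv, HEADER true'
);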
Update: See the related aws_s3.table_import_from_s3 stub.
CREATE SCHEMA IF NOT EXISTS aws_s3;
--
-- For local testing
-- stmt: The select statement to be exported
-- bucket: S3 bucket name, ignored
-- path: Full absolute path to the desired output file
-- region: S3 region, ignored
-- opt: COPY options
-- returns exported rows, files (1 if any rows were exported, else 0), and bytes (faked as rows * 1024)
--
CREATE OR REPLACE FUNCTION aws_s3.query_export_to_s3(
    stmt TEXT, bucket TEXT, path TEXT,
    region TEXT, opt TEXT,
    OUT rows bigint, OUT files bigint, OUT bytes bigint)
AS
$$
DECLARE
    copy_stmt TEXT;
BEGIN
    -- Build and run a plain COPY from the caller's query, path, and options
    copy_stmt := 'COPY (' || stmt ||
                 ') TO ''' || path ||
                 ''' WITH (' || opt || ');';
    EXECUTE copy_stmt;
    GET DIAGNOSTICS rows = ROW_COUNT;
    IF rows > 0 THEN
        files := 1;
        bytes := rows * 1024;  -- faked; the real function reports bytes uploaded
    ELSE
        files := 0;
        bytes := 0;
    END IF;
END;
$$
LANGUAGE plpgsql;
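A quick smoke test of the stub might look like the following. The table and output path are illustrative, and note that COPY ... TO a server-side file requires superuser or membership in the pg_write_server_files role.

-- Illustrative local test: export 100 rows to a server-writable path.
CREATE TABLE IF NOT EXISTS demo (id int);
INSERT INTO demo SELECT generate_series(1, 100);

SELECT * FROM aws_s3.query_export_to_s3(
    $sel$ SELECT id FROM demo $sel$,
    'ignored-bucket',   -- bucket is ignored by the stub
    '/tmp/demo.csv',    -- local absolute path stands in for the S3 key
    'ignored-region',   -- region is ignored by the stub
    'FORMAT csv, HEADER true'
);
--  rows | files | bytes
-- ------+-------+--------
--   100 |     1 | 102400   (bytes faked as rows * 1024)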