
Aaron Sparks | 2019-12-28 | databases, postgresql, sql server, azure, sql

Screenshot of one of the HTML/Jinja templates for this website. (Photo credit - Aaron Sparks)

I've been working on a consulting project where we're trying to use Microsoft products where it makes sense. I originally prototyped a few things in Postgres and then found myself needing to convert them over to SQL Server. After some quick googling, I discovered that most of the posts/Stack Overflow content on this topic covers going from SQL Server to Postgres, not the other way around! I did manage to find one Stack Overflow post with some helpful hints. Since the hints were all at a fairly high level, I decided to write this post and provide some of the code I used.

Note: All of the code below was developed and tested on a Mac using zsh (not bash). sed is an external command, so it behaves the same regardless of which shell you use. One caveat: the -i '' syntax in the commands below is specific to macOS/BSD sed; GNU sed on Linux uses -i with no empty-string argument.

Schema conversion
The first step in the conversion process is re-creating the schema (i.e. table and field names, types). In terminal, run pg_dump on your Postgres database to get the current schema.

pg_dump --schema-only --no-owner [your-database-name] > schema_create_script.sql

SQL Server doesn't accept some of the statements and syntax that Postgres emits, so we're going to do some processing to strip those out before creating the new schema in SQL Server. I'll walk you through the approach and then show the code. First, we're going to add BEGIN TRANSACTION (at the start) and ROLLBACK TRANSACTION (at the end) of the script. This lets you make a few test runs and find any errors without committing anything. Once you've addressed all of the errors, comment out the ROLLBACK TRANSACTION and add a COMMIT TRANSACTION at the end instead. In your favorite editor (note: I've been using Azure Data Studio), run schema_create_script.sql against SQL Server and see what errors you get.
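For reference, the wrapped script looks something like this (the table definitions are elided; only the transaction scaffolding is shown):

```sql
BEGIN TRANSACTION;

-- ...pg_dump schema output (CREATE TABLE statements, etc.) goes here...

-- Leave ROLLBACK in place while testing; once the script runs clean,
-- comment it out and end with COMMIT TRANSACTION instead:
ROLLBACK TRANSACTION;
```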

Next, let's go over the statement and syntax issues you will encounter:

  • SET: Remove the SET statements from the top of the script. You can do this manually or use the sed commands below.
  • Schema: Remove every instance of Postgres's default schema prefix (i.e. "public.").
  • Primary keys: Postgres and SQL Server treat primary keys/unique identifiers differently. Postgres typically uses a bigint field backed by a sequence to auto-increment. This needs to be replaced with something like [field name] int IDENTITY PRIMARY KEY.
  • Booleans: Postgres uses a boolean data type (true/false). SQL Server uses BIT (1/0).
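To make the last two bullets concrete, here's a hypothetical users table before and after (the table and column names are mine for illustration, not from any real dump):

```sql
-- Postgres (as emitted by pg_dump):
--   CREATE TABLE users (
--       id bigint NOT NULL,      -- auto-incremented via a sequence
--       is_active boolean
--   );

-- SQL Server equivalent:
CREATE TABLE users (
    id int IDENTITY PRIMARY KEY,
    is_active BIT
);
```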

And lastly, here are a few quick sed commands that can help with the schema conversion if you'd rather not edit by hand. In terminal, run the following:

sed -i '' '/^SET/ d' schema_create_script.sql
sed -i '' 's/public\.//g' schema_create_script.sql
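If you want to see the effect of those deletions end-to-end, here's a small self-contained sketch. The dump content is a made-up stand-in, and I've used the portable non-in-place form of sed (writing to a new file) so it works with both BSD and GNU sed:

```shell
# Stand-in for pg_dump --schema-only output (hypothetical table)
cat > schema_create_script.sql <<'EOF'
SET statement_timeout = 0;
SET client_encoding = 'UTF8';
CREATE TABLE public.users (
    id bigint NOT NULL,
    is_active boolean
);
EOF

# Same two edits as above, combined into one pass, written to a new file
sed '/^SET/d; s/public\.//g' schema_create_script.sql > schema_create_script_clean.sql

# The cleaned file now contains only the CREATE TABLE, with no SET lines
# and no "public." prefix
cat schema_create_script_clean.sql
```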

Data migration
The next step is to migrate the data from Postgres over to SQL Server. In terminal, run pg_dump on Postgres to get the data.

pg_dump --data-only --column-inserts --table '[your tablename]' -d [your-database-name] > data_load_script.sql

Similar to the schema conversion, you're going to want to do some processing around the SET statements, schema naming, and other Postgres-specific statements/syntax.

sed -i '' 's/public\.//g' data_load_script.sql
sed -i '' '/^SET/ d' data_load_script.sql
sed -i '' '/^SELECT pg_catalog/ d' data_load_script.sql
sed -i '' '1s/^/SET IDENTITY_INSERT dbo.[your tablename] ON\; /' data_load_script.sql
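As before, here's a self-contained sketch of what those commands accomplish, using a made-up table name (users) and the portable non-in-place sed form. Note that the deletions have to happen before prepending the IDENTITY_INSERT line, because sed's d command ends processing for a matched line:

```shell
# Stand-in for pg_dump --data-only --column-inserts output (hypothetical)
cat > data_load_script.sql <<'EOF'
SET statement_timeout = 0;
INSERT INTO public.users (id, is_active) VALUES (1, true);
SELECT pg_catalog.setval('public.users_id_seq', 1, true);
EOF

# Drop SET / pg_catalog lines and the "public." prefix, then prepend
# SET IDENTITY_INSERT to whatever is now the first line
sed '/^SET/d; /^SELECT pg_catalog/d; s/public\.//g' data_load_script.sql \
  | sed '1s/^/SET IDENTITY_INSERT dbo.users ON; /' > data_load_clean.sql

# Reminder: boolean literals (true/false) in the INSERTs still need to be
# mapped to 1/0 by hand for BIT columns
cat data_load_clean.sql
```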

Lastly, I rebuilt my views from scratch rather than trying to convert them. Casting from one type to another differs substantially between the two databases, and the nested casting and conversion formulas got too complex to translate mechanically.

The biggest challenge was converting integer-based datetime values (Unix epoch milliseconds) to a date. The project I'm working on has a CRM system whose API sends the raw integers (vs. a datetime). Postgres has a nice cast for this:

to_timestamp(("dateAdded"::numeric / 1000::numeric)::double precision) AS datetime_added

When I went to rebuild the equivalent of this in SQL Server, I discovered that some dates cause an arithmetic overflow using a similar approach (DATEADD takes an int for its number argument, so second counts beyond 2^31 - 1, i.e. dates from early 2038 onward, overflow):

DATEADD(s, CONVERT(bigint, dateAdded) / 1000, CONVERT(datetime, '1-1-1970 00:00:00')) AS datetime_added

So, after a lot of googling, I found an alternative that converts all of the dates without any overflow issues:

CAST(dateAdded / 86400000.0 + 25567 AS DATETIME) AS datetime_added
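To convince yourself the constants are right (86,400,000 is milliseconds per day; 25,567 is the number of days between SQL Server's DATETIME day zero, 1900-01-01, and the Unix epoch, 1970-01-01), here's a quick check in the terminal with a known timestamp:

```shell
# 1577491200000 ms = 2019-12-28 00:00:00 UTC, which is 18258 days after 1970-01-01
# Expected result: 18258 + 25567 = 43825 days since 1900-01-01
awk 'BEGIN {
  ms = 1577491200000
  printf "%.4f\n", ms / 86400000.0 + 25567
}'
```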

