Replication over the Internet or across non-trusted domains is usually performed using a virtual private network (VPN), and consequently the configuration is much the same as that used on a LAN. This article outlines what to do if such a VPN is not available. While it’s true that in almost all cases you wouldn’t consider making SQL Server available directly to the Internet, loosely-coupled networks in a hosted environment are commonplace. These systems ensure integrity by the use of firewall rules, resulting in DMZ/ECZ layers and causing serious issues for the would-be replicator. This is very much a ‘how-to’ guide, as the intricacies of network connectivity setup would otherwise make the article too long.
In replicating over non-trusted networks, the main additional considerations over and above a normal setup are:
When configuring the subscriber, the publisher will not initially be visible in Enterprise Manager, and cannot be added in the usual way.
After the snapshot has been created, it is normally accessed by the merge/distribution agent using the repldata share, which will be unavailable over the Internet.
The replication agents normally run under the context of a windows user account and are impersonated at the distributor/publisher; however an AD lookup at the publisher when accessing the snapshot files will not be able to find the windows user.
What follows is a list of the configuration changes necessary to solve these issues.
SQL communication needs to be enabled on the port defined at the publisher. For a default instance, this is typically port 1433, while FTP uses port 21 by default. So, rules allowing the subscriber’s IP address access on these ports through the firewall need to be set up, implying that the subscriber must not get a leased DHCP IP address, but have a fixed address of its own.
In SQL Server 2005, merge replication is possible over port 443 (SSL/HTTP) — more on that in another article — but for our case, the security is provided by the use of logins and passwords, firewall rules, and encryption if necessary.
The repldata share on the publisher that is normally used to hold the snapshot initialization files will be unavailable — discounting the use of a guest windows login at the publisher which is simply too insecure a method to consider — so an alternative initialization method is needed. The main choices are automatic FTP and alternative snapshot locations at the subscriber. For the latter, when a pull subscription is created, it can be pointed at a local directory for the snapshot files. I often use this method as my snapshot files are so large (>40GB) that I save a lot of time doing the whole thing manually. Basically, this involves the creation of the snapshot, compressing the directory using WinZip 9.0, manually FTPing to the subscriber then unziping locally. After that I use the alternative snapshot location (@alt_snapshot_folder) on the subscriber to initialize. This method is used because although compression using CAB files is available within the publication wizard, it is limited to a maximum size of 2GB.
This article however will concentrate on the inbuilt, automatic method of FTP which is configured at the publisher on the snapshot location tab. If all the subscribers need to be set up in the same way (non-trusted domains etc) then on this tab only the second checkbox is selected at the top. The snapshot folder lies under the FTPROOT directory and ‘Subscribers can access this folder using FTP’ obviously needs to be selected. If you do choose to compress, the files will be uncompressed using the Temp folder of the subscriber, so you must check that you have enough space. The client path is the path from the FTP root directory, so in this case is TRSnapshotftp. The login can be left as the default for anonymous access (not recommended) or a valid windows login at the publisher is selected that has read access to the FTPROOT directory –- PaulI in this case.