Most material available these days covers DevOps from the perspective of source code being written, tested and deployed. The pains and problems, however, are equally real in the metadata space. Until recently, adopting DevOps practices in a metadata-based environment, particularly within the SAS technology stack, had never been tried before, even by the vendor itself.
From applying the principles of software delivery to creating a deployment pipeline using the DevOps toolchain, this post will cover the deployment challenges we've had and the approach to solving them.
The Deployment Dilemma
The SAS ETL team's remit is to develop process flows in SAS Data Integration Studio. This software delivers the capability to consolidate data from a variety of data sources and perform tasks such as extracting, transforming and loading data for use in data warehouses and data marts. These batch jobs are saved in the SAS Metadata Server and promoted to different metadata environments for actual execution and end-to-end testing.
The promotion and deployment activities in SAS are traditionally carried out manually using SAS Data Integration Studio. Using the IDE, an engineer needs to:
- export the object metadata from the DEV environment in SAS package format
- promote/deploy the SAS package into the target metadata environment
- kick off a job deployment process to generate the SAS code
These manual steps were a challenge. During metadata object promotion, for example, the engineer has to click several buttons in the IDE and choose the components (e.g. WebAppDI, WebAppDI_SIT) and libraries that match the target environment to ensure the job works correctly after deployment. A manual deployment takes roughly 10 minutes per job, depending on the job's size.
This is tedious, error-prone and inefficient.
The Solution
The countermeasure is to leverage the DevOps toolchain to create a repeatable and reliable deployment pipeline. This requires adopting two principles as prerequisites: automation, and keeping release artifacts in version control.
Automation is a prerequisite because only automation can guarantee that a repeatable and reliable process exists. It does, however, require CLI tools that can perform tasks outside the IDE, which is a litmus test when considering automation. Fortunately, the SAS BatchTool exists, but it is cumbersome to use directly, so build scripts were developed in Python and codified within Ansible playbooks to simplify things further.
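A minimal sketch of what such a wrapper script might look like follows. The executable name `export-tool`, the option names and the host and package values are placeholders chosen for illustration, not the actual SAS batch tool syntax. The point is only that a script builds the command line the same way every time and fails loudly when the tool returns a non-zero exit code.

```python
import subprocess
from typing import List, Mapping, Sequence


def build_command(executable: str, options: Mapping[str, str]) -> List[str]:
    """Turn an options mapping into an argument list.

    {"host": "x"} becomes ["-host", "x"]; insertion order is preserved.
    """
    command = [executable]
    for name, value in options.items():
        command.extend([f"-{name}", value])
    return command


def run_tool(command: Sequence[str]) -> str:
    """Run the command and return its standard output.

    check=True raises CalledProcessError on a non-zero exit code, so a
    failed export or import stops the build instead of passing silently.
    """
    completed = subprocess.run(
        list(command), check=True, capture_output=True, text=True
    )
    return completed.stdout


# Hypothetical invocation: the tool name, option names and values are
# assumptions for this example, not real SAS command-line syntax.
export_command = build_command(
    "export-tool",
    {"host": "dev-metadata.example.com", "package": "etl_jobs.spk"},
)
```

Keeping command construction separate from execution means the generated command line can be logged or unit-tested without touching a metadata server.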
SAS DI Studio also lacks integration with distributed version control, so this quasi-build tool laid the groundwork for embracing another important principle: keeping everything that is needed to build, deploy, test and release an application in version control.
As a result, the team was able to embrace more automation using the DevOps toolchain. Jenkins Pipeline jobs were created that export object metadata, commit it to version control and deploy it to any environment. They also include jobs for running automated tests, creating generic release packages and uploading them to a binary repository. All of this is performed at the push (OK, click) of a button in the Jenkins UI.
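The flow of such a pipeline can be sketched as an ordered list of stages that stops at the first failure. This is a simplified Python illustration, not the actual Jenkins Pipeline definition. The stage names, the `Stage` type and the environment-to-component mapping (which assumes WebAppDI belongs to DEV and WebAppDI_SIT to SIT) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumed mapping of target environment to component name. In practice
# this would live in version control alongside the build scripts.
COMPONENT_BY_ENV: Dict[str, str] = {"DEV": "WebAppDI", "SIT": "WebAppDI_SIT"}


@dataclass
class Stage:
    name: str
    # Receives the target environment and returns True on success.
    action: Callable[[str], bool]


def resolve_component(environment: str) -> str:
    """Look up the component for an environment, rejecting unknown names."""
    try:
        return COMPONENT_BY_ENV[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment}") from None


def run_pipeline(stages: List[Stage], environment: str) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    resolve_component(environment)  # fail before doing any work
    completed: List[str] = []
    for stage in stages:
        if not stage.action(environment):
            break
        completed.append(stage.name)
    return completed


# Placeholder actions; real ones would call the build scripts.
stages = [
    Stage("export", lambda env: True),
    Stage("commit", lambda env: True),
    Stage("deploy", lambda env: True),
]
```

Validating the environment name before the first stage runs replaces the manual step of picking the right component in the IDE with a lookup that either succeeds or fails immediately.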
The Results
By implementing the solution using the DevOps toolchain, the deployment pipeline became an integral part of the development and release management process. With it in place, a consistent, repeatable and reliable deployment process that can be triggered by anyone in the delivery team now exists. This revolutionized code management and deployment procedures for a metadata-based product, something once thought impossible to accomplish.
This also facilitated the adoption of other DevOps tools such as Ansible for automating operational tasks, for example creating SAS user accounts with key-based authentication across the entire server fleet. Having Ansible playbooks written for application deployment and exercised regularly during releases in the lower environments also resulted in efficiency gains during production deployment. The deployment instructions are short and concise, which makes deployment tasks simpler.
Furthermore, a telemetry infrastructure using Elasticsearch, Logstash/Beats and Grafana was implemented in the lower environment, providing visibility into server and application health.
Overall, adopting DevOps practices and toolchains has minimized the inefficiencies of manual processes and improved cycle times. It delivered a standardized process and empowered the team to focus on value-adding tasks instead of the boring details of manual deployment.