Design and Implementation of a SAP HANA Cloud Configuration Handler

worked on by: Cora Glaß


Motivation

SAP recently released a new product, SAP HANA Cloud. SAP HANA is an in-memory database management system that gives its users fast data access, querying and processing. SAP HANA Cloud is SAP's highly scalable cloud offering of SAP HANA and is currently available on AWS and Microsoft Azure. In the following, a system denotes the infrastructure on which the SAP HANA Cloud instances (in the following called HANA instances or instances) run. A HANA instance exposes many configuration parameters, so that it can be tuned to the differing requirements of many customers. The user and the customer can be the same person.

When operating a system with many such highly configurable instances, it is necessary to keep an overview of how each instance or group of instances is configured. For example, the configuration of a HANA instance could be changed as a temporary workaround for a bug. A bug usually affects multiple HANA instances, so a whole group of instances has to be reconfigured. In this case, we must know which instances belong to the group in order to roll back their configurations once the bug has been fixed.

If the total number of instances is low, they can be managed manually. This means, however, that a subset of employees must be responsible for the configurations and needs access to a productive system. This introduces the possibility of human error as well as workload that could be reduced or replaced by automation. As the number of instances grows, configuration handling becomes more and more time-consuming and demands continuous effort, because, for example, maintaining an overview of all instances gets more complex. To reduce this manual workload, the site reliability engineering (SRE) approach of automating ops-related work, for example in the form of a service, can be applied.

Goal

The goal of my bachelor thesis is to improve the configuration handling of HANA instances by introducing the following functionalities in the form of a single service:

  • secure configuration of instances without accessing the system manually
  • configuration of multiple instances via one request
  • on-demand generation of a configuration state overview of all HANA instances
  • creation and application of configuration profiles that specify parameters and the values they should be set to
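To make the functionalities above more concrete, the following is a minimal sketch of how a configuration profile, a multi-instance request and a state overview could be modelled. All class, field and function names (`ParameterSetting`, `ConfigurationProfile`, `ConfigurationRequest`, `state_overview`) are assumptions made for illustration and are not taken from the actual handler.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ParameterSetting:
    """One parameter and the value it should be set to (hypothetical shape)."""
    name: str
    value: str


@dataclass
class ConfigurationProfile:
    """Hypothetical profile: a named collection of parameter settings."""
    name: str
    settings: list[ParameterSetting] = field(default_factory=list)


@dataclass
class ConfigurationRequest:
    """Hypothetical request: apply one profile to several instances at once."""
    profile: str
    instances: list[str]


def state_overview(applied: dict[str, str]) -> dict[str, list[str]]:
    """Group instance names by the profile currently applied to them."""
    overview: dict[str, list[str]] = {}
    for instance, profile in sorted(applied.items()):
        overview.setdefault(profile, []).append(instance)
    return overview
```

Grouping by profile is one possible form of the overview; it directly answers the question raised in the motivation, namely which instances belong to a group that has to be rolled back.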

Thesis Requirements

formulate requirements here (together with your adviser)

Milestones and Planning

A milestone is a scheduled event signifying the completion of a major deliverable or a set of related deliverables. A milestone has zero duration and no effort -- there is no work associated with a milestone. It is a flag in the workplan to signify some other work has completed. Usually a milestone is used as a project checkpoint to validate how the project is progressing and revalidate work. (Source: http://www.mariosalexandrou.com/definition/milestone.asp)

Milestone no. Past days CW Goals target accomplished
1 DONE 1 CWXX Goals accomplished

Weekly Status

Week - (CW 25)

Activities

  • reading and thinking: building first concepts, creating first architecture drafts, starting the documentation (containing open questions, challenges, possible solutions, decisions etc.)
  • clarification of requirements
  • starting to define the scope
  • making the first architectural decisions
  • implementing a first draft of a simple microservice

Results

Next Steps

  • working on concepts
  • making service available

Problems

  • clarification of scope (mostly resolved in CW 26)

Week - (CW 26)

Activities

  • continuing with the documentation and concepts
  • doing research on how to expose a MicroService (running on a cluster) in a secure way
  • creating architecture (+ relationship/dependency) overviews and thinking about possible challenges

Results

  • the microservice is now reachable from outside the cluster (might switch to HTTPS)

Next Steps

  • introduce configuration request handling

Problems

Week - (CW 27)

Activities

  • continuing with the documentation and concepts
  • introducing simple request handling (request parsing and validation, mini handler for different request types)
  • starting with Unit testing
  • granting the service needed permissions
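The simple request handling from this week (parsing, validation and a small handler per request type) could look roughly like the sketch below. The request types `configure` and `overview`, the JSON field names and the handler functions are hypothetical; the real service may use a different format.

```python
import json

# Registry mapping a request type to its handler function.
HANDLERS = {}

# Hypothetical required fields per request type.
REQUIRED_FIELDS = {"configure": ("profile", "instances"), "overview": ()}


def handles(request_type):
    """Decorator that registers a handler for one request type."""
    def register(func):
        HANDLERS[request_type] = func
        return func
    return register


@handles("configure")
def handle_configure(body):
    return f"configure {len(body['instances'])} instance(s) with profile {body['profile']}"


@handles("overview")
def handle_overview(body):
    return "overview requested"


def parse_and_validate(raw):
    """Parse a JSON request and reject it early if it is malformed."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(f"request is not valid JSON: {err}") from err
    if not isinstance(request, dict):
        raise ValueError("request must be a JSON object")
    request_type = request.get("type")
    if request_type not in HANDLERS:
        raise ValueError(f"unknown request type: {request_type!r}")
    missing = [f for f in REQUIRED_FIELDS[request_type] if f not in request]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return request


def handle(raw):
    """Validate the raw request, then dispatch it to the matching handler."""
    request = parse_and_validate(raw)
    return HANDLERS[request["type"]](request)
```

Validating before dispatching keeps the individual handlers free of input checks, which also makes them easier to unit test.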

Results

  • simple request handling
  • cleaner code

Next Steps

  • more Unit testing
  • stricter permissions
  • adding K8s client to communicate with other services/instances/resources
  • (first simple integration test)

Problems

  • with the current logic, the service needs a lot of permissions (hopefully solved in CW 28)

Week - (CW 28)

Activities

  • continuing with the documentation and concepts
  • getting Unit tests to run
  • adding and using K8s client
  • fighting the 5 horsemen of LaTeX package dependencies

Results

  • communication between the handler and K8s resources
  • working unit test setup

Next Steps

  • improve profile handling
  • implement instance configuration adapter for HANAs
  • thesis: explain problem, system setup (+ K8s, containerization etc.), related work

Problems

Week - (CW 29)

Activities

  • Adding more unit tests
  • Thinking about the thesis structure
  • Starting to implement the instance configuration adapter for HANAs
  • Splitting functions into API Call sections and pure logic to avoid mocking
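The split between API call sections and pure logic can be illustrated with a small sketch. The label example and the function names `labels_to_patch` and `apply_labels` are assumptions for illustration; the two callables passed in stand for whatever the K8s client provides.

```python
def labels_to_patch(current, desired):
    """Pure logic: return only the labels that are new or have changed."""
    return {key: value for key, value in desired.items() if current.get(key) != value}


def apply_labels(fetch_labels, patch_labels, pod_name, desired):
    """Thin API section: fetch state, delegate to the pure logic, send the patch.

    fetch_labels and patch_labels are placeholders for real client calls.
    """
    current = fetch_labels(pod_name)
    patch = labels_to_patch(current, desired)
    if patch:  # skip the API call entirely when nothing changes
        patch_labels(pod_name, patch)
    return patch
```

With this split, `labels_to_patch` can be unit tested with plain dictionaries, and only the thin `apply_labels` wrapper ever touches the cluster.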

Results

  • More unit tests
  • simple adapter

Next Steps

  • starting the thesis writing

Problems

Week 1 (CW 30)

Activities

  • thesis writing (topics: environment)
  • working on configuration adapter for HANAs

Results

  • handler is now able to adjust pod labels

Next Steps

  • add the functionality to adjust parameters within the HANA
  • write thesis (topics: problem, solution)

Problems

Week 2 (CW 31)

Activities

  • writing on thesis topics (environment, problem)

Results

Next Steps

  • add the functionality to adjust parameters within the HANA
  • auditing

Problems

  • internet connection

Week 3 (CW 32)

Activities

  • implementing client for a vault system which holds the permissions to connect to the databases
  • changing the approach from entering the containers to executing queries from the outside (more secure)
  • configuring the default setting of the landscapes to receive basic permissions for the handler
  • setting up a pipeline for testing (later also for the release process)
  • adding the needed driver and successfully pinging the database

Results

  • client for the vault
  • more secure approach to get permissions to connect to the databases

Next Steps

  • add the actual configuration process as queries
  • auditing
  • add functionality: update configuration of HANAs if profile in use was updated
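As a rough idea of what expressing the configuration process as queries could mean, the sketch below builds an SAP HANA `ALTER SYSTEM ALTER CONFIGURATION` statement for one parameter. Whether the handler uses exactly this statement is an assumption, and the helper names `quote` and `set_parameter_sql` are hypothetical.

```python
def quote(value: str) -> str:
    """Quote a SQL string literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def set_parameter_sql(ini_file, layer, section, key, value):
    """Build the statement that sets one parameter and applies it immediately.

    Assumption: parameters are changed via ALTER SYSTEM ALTER CONFIGURATION.
    """
    return (
        f"ALTER SYSTEM ALTER CONFIGURATION ({quote(ini_file)}, {quote(layer)}) "
        f"SET ({quote(section)}, {quote(key)}) = {quote(value)} WITH RECONFIGURE"
    )


print(set_parameter_sql("global.ini", "SYSTEM", "memorymanager", "statement_memory_limit", "50"))
```

The statement text is only generated here; executing it would require a database connection with suitable privileges, which this sketch deliberately leaves out.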

Problems

  • for the vault client to work I had to change the default settings in our landscapes
  • deleted one workday's worth of code (recovered it after some time)

Week 4 (CW 33)

Activities

  • switching to the permissions of another abstraction layer of the databases
  • refactoring and improving code (reduced total of API calls in one handling process execution)
  • configuring the default setting of the landscapes to receive basic permissions for the handler again
  • restructuring the configuration request form
  • adding configuration functionality for HANAs
  • extending the configuration feature to also handle a group of HANAs
  • adding a first (naive) rollback functionality
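A naive rollback that relies on the memory of the handler could be sketched as follows. The class `InMemoryRollbackStore` and its methods are hypothetical and only illustrate the idea of remembering the previous values.

```python
class InMemoryRollbackStore:
    """Hypothetical store that remembers the value a parameter had before a change."""

    def __init__(self):
        self._previous = {}  # instance name -> {parameter: original value}

    def remember(self, instance, parameter, old_value):
        # Keep only the first recorded value, so that repeated changes
        # still roll back to the original state.
        self._previous.setdefault(instance, {}).setdefault(parameter, old_value)

    def rollback_plan(self, instance):
        """Return the values to restore and forget them afterwards."""
        return self._previous.pop(instance, {})
```

Because the state lives only in process memory, it does not survive a restart of the handler, which is one general limitation of any in-memory approach.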

Results

  • HANA configuration Adapter
  • improved the performance (reduced API calls)

Next Steps

  • building a better solution for the rollback (currently using memory of the handler)
  • add functionality: retrigger configuration of HANAs in case of an update of a profile in use

Problems

  • realised that I needed the connection permissions for another abstraction layer of the database

Week 5 (CW 34)

Activities

  • add functionality: watching events on profiles (update, delete)
  • add functionality: execute update on HANAs if used profile was updated
  • refactoring
  • adjustment: handling requests and watching profiles in parallel
  • created separate project which contains sample profiles (for now)
  • starting to work on the request handling for profile creation and change
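How a profile event could be mapped to actions on HANAs is sketched below as pure logic. The event type names follow the Kubernetes watch convention (`ADDED`, `MODIFIED`, `DELETED`); the function `plan_actions`, the action names and the `usage` mapping are assumptions, and the rollback on deletion is only planned at this point.

```python
def plan_actions(event_type, profile_name, usage):
    """Decide what to do with the instances that use the affected profile.

    usage maps an instance name to the name of the profile it currently uses.
    """
    affected = sorted(
        instance for instance, profile in usage.items() if profile == profile_name
    )
    if event_type == "MODIFIED":
        # An updated profile triggers a reconfiguration of its users.
        return [("reconfigure", instance) for instance in affected]
    if event_type == "DELETED":
        # Assumption: a deleted profile would lead to a rollback.
        return [("rollback", instance) for instance in affected]
    return []  # e.g. ADDED: no instance uses a brand-new profile yet
```

Keeping this decision separate from the watch loop means it can be tested without a cluster, in line with the split between API calls and pure logic from CW 29.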

Results

  • Update of profiles can trigger the reconfiguration of HANAs

Next Steps

  • Execute rollback if used profile was deleted
  • add feature: Ops can add or adjust profiles temporarily
  • auditing

Problems

  • rollback solution is still very naive

Week 6 (CW 35)

Activities

  • finished the request handling for profile creation and change
  • implementing the deprecation mechanism for profiles that are created via the handler
  • improving the documentation on building and deploying the handler
  • refactoring and adding comments

Results

  • new feature: create/change profiles via handler + deprecation label for created profiles
  • new feature: profiles are deleted based on the deprecation label
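One way a deprecation label could drive the deletion is sketched below. The label key `deprecated-after` and the idea of storing an ISO date in it are assumptions made for illustration; the actual label format is not described here.

```python
from datetime import date

DEPRECATION_LABEL = "deprecated-after"  # hypothetical label key


def expired_profiles(profiles, today):
    """Return the names of profiles whose deprecation date has passed.

    profiles maps a profile name to its labels (a dict of str to str).
    Profiles without the label are never selected.
    """
    expired = []
    for name, labels in sorted(profiles.items()):
        raw = labels.get(DEPRECATION_LABEL)
        if raw is not None and date.fromisoformat(raw) < today:
            expired.append(name)
    return expired
```

Only profiles that carry the label are candidates, so profiles that were not created via the handler stay untouched in this sketch.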

Next Steps

  • resolve the last TODOs in the handler
  • start to mainly write on the thesis

Problems

  • my laptop broke and I could not work for a whole day

Week 7 (CW 36)

Activities

  • improving the structure of the bachelor thesis
  • writing on thesis: section introduction and implementation (features, approaches, concepts etc.)
  • creating architecture overview

From this point onward I will concentrate on writing the thesis. Therefore, the weekly updates are rather short.

Week 8 (CW 37)

Writing the thesis.

Week 9 (CW 38)

on sick leave

Week 10 (CW 39)

This week I worked on the following sections of my bachelor thesis:
  • similar work
  • implementation
  • challenges
  • future work
  • evaluation

Week 11 (CW 40)

Finishing the first draft.

Week 12 (CW 41)

This week will be used to let people proofread the thesis and to improve it based on their feedback.