Innovation + Commitment = The OptumSoft Way
At OptumSoft, it’s not enough to innovate: we also pride ourselves on building world-class products and providing a wide range of support to all of our customers. We know that in today’s economy, organizations need assurance that they are not only buying software for today but also making an investment for the future. That’s why we constantly come up with new solutions, release new versions of our products (including our flagship OptumSoft TACC®), and work closely with our customers and partners to create new features.
We also take pride in never abandoning our clients, whether it’s addressing a technical issue or supporting legacy products. It’s why OptumSoft is a trusted partner that has developed a reputation for excellence not only in the lab, but in the field.
TACC Overview
Introduction
OptumSoft TACC includes a schema language, a compiler, an object database, and a runtime environment that empower organizations to efficiently create resilient, high-availability (HA), high-performance embedded and distributed applications for cloud and telecom service providers, and for cloud-networked systems such as wireless network equipment and Internet of Things (IoT) appliances.
OptumSoft’s customers are using TACC in production systems, from network equipment installed worldwide to a distributed HA key-value database that serves as the back end of a massively multiplayer online game.
TACC enables these customers to rapidly develop, deploy and evolve carrier-grade code by allowing developers to:
- focus on business logic, not inter-process communication and data management
- implement robust systems with modular upgradability, fault isolation, clustered scalability and HA
- write dramatically less code
Developers can use TACC without having specialized expertise in HA or distributed systems.
OptumSoft’s TACC products help customers write modular software that incorporates advanced object database technologies with less complexity than alternative approaches. TACC also provides built-in support for HA with automatic failover, high scalability through dynamic expansion and contraction of databases across clustered servers, in-service software upgrade, and user-extensible application modules. In our experience, TACC achieves these objectives with 90% fewer lines of code than implementing these features by hand.
TACC incorporates a schema language that is easy for programmers with database experience to learn and use. Using TACC’s development workflow, the TACC compiler combines the developer’s business logic with a schema text file and generates C++ code that includes runtime code generated inline, and shared library calls.
TACC leverages various open source technologies such as Linux, and makes it easy to produce code that can be integrated with open source software such as a database, or connected to deployment frameworks such as OpenStack.
TACC for Telecom Carriers, Service Providers and IoT
Telecom carriers, including mobile network operators, are going to experience an increase in the number of devices and in service complexity, driven by mobile device growth and by emerging Internet of Things (IoT) devices and services. The carriers and their equipment suppliers have made large investments in provisioning, orchestration and service creation systems that are running into scalability problems and do not meet providers’ agility and elasticity requirements. In many cases, the workflow processing and user interface components are not the limiting factors; often the backend databases cannot handle the volume. TACC was designed to address the scalability bottlenecks characteristic of traditional client-server databases (and even of the new generation of NoSQL databases), and to do so while providing a dramatically simplified programming model that supports distributed computing at massive scale.
Developers can use TACC to implement a fault-tolerant, dynamically scalable (up and down), object database that is deployable in stages, addressing customers who need to migrate seamlessly to new technology. The result would be expanded capacity and agility, while protecting the carrier’s investment in their existing systems.
TACC is also usable with new platforms, such as OpenDaylight, for Software-Defined Networking (SDN) and Network Function Virtualization (NFV). For example, YANG models can be easily translated to TACC schemas. While these initiatives have fostered a variety of industry projects producing software, in several cases, implementation of an HA distributed persistent store is targeted for the future. Using TACC, vendors and service providers can fulfill this requirement much more rapidly, and with lower risk, than by using other approaches.
TACC for Cloud and Big Data Applications
In the cloud applications world, customers are demanding capabilities such as NoSQL clustered databases for Big Data, 24-hour, 365-day operation, and the ability to respond to changes in resource demand, both up and down. Most Big Data databases do not have the features necessary for real-time transaction processing. TACC has features that ensure distributed data consistency and, where required, support ACID transactions. TACC empowers developers to write applications using the same database for both Big Data analytics and real-time transaction processing.
Systems developers who have chosen off-the-shelf components such as an open source NoSQL database and ZooKeeper, an open source program providing distributed configuration and synchronization, have run into challenges. Several of the largest cloud companies and commercial vendors are investing significant resources to modify these tools. Anecdotal data indicates some have invested 100+ person-years. This may be acceptable for the top few companies that have the resources to make their flagship service work, but this approach is not feasible for most service providers and enterprises. TACC enables developers to take advantage of a comprehensive suite of proven capabilities, designed and tested to work together.
In most cases, TACC can mitigate issues with consistency, sharing or failover, either by replacing the existing database or by serving as an in-memory cache in the current system.
Numerous companies using Big Data databases have experienced various issues as workloads have increased. Some have been very difficult if not impossible for them to resolve. Examples are failover that does not always work correctly, system hangs, scaling problems where sharding does not work quite right and erratic performance. The result is that programmers have to wade into source code they do not understand, or post their problems on message boards to find workarounds that may address their specific problem, and with luck will not break something else. TACC addresses this by providing an integrated suite with proven HA and dynamic scalability.
Many issues are an unfortunate by-product of integrating multiple open source projects. Although many open source databases exist, no single package provides the combined capabilities of database clustering, HA and consistency assurance. ZooKeeper requires significant DevOps resources to make it work with other components. ZooKeeper also typically requires error-prone manual intervention when failures occur or reconfiguration is necessary. With TACC, failover and leader election are both part of our integrated framework and provide automated consistency, dynamic sharding and failover without needing an operator to perform manual tasks.
TACC for Embedded Systems and IoT Devices
Many vendors are facing requirements to add cloud connectivity or more customizability to their current systems. In most cases, TACC can provide these enhancements to existing systems without major changes to the current software.
Embedded devices today typically have multicore CPUs and 512 MB or more of main memory, and Linux has become commonplace. Users expect continuous operation, which is driving requirements for fault isolation and the ability to upgrade without rebooting the system. TACC provides features to address these requirements, to take full advantage of the hardware, and more.
Adoption of new sensor and control devices for Internet of Things applications is driving new device types and new software requirements. Equipment and systems vendors are looking for new options in order to monitor and control rapidly proliferating devices. While there are major initiatives around standards for messaging, i.e. data plane protocols such as MQTT, XMPP and others, the management and control planes in many cases fall short in areas such as the ability to recover from softer glitches without rebooting the device, and in-service upgradability. Device manufacturers can use the TACC control and management protocol alongside industry-standard messaging protocols to provide stateful restart on a per-process basis, and remote software upgradability without specialized hardware support. With TACC, systems designers have the flexibility to place controllers, with HA, in the cloud, in on-premises devices, or even distributed across the edge network. Further, TACC’s small memory footprint makes it feasible for use in very small systems, which up to now have been limited in capability.
For many systems vendors, long software test cycles are hindering feature enhancement and slowing bug fixes. A common cause is complex multi-threaded process modules, which have to be extensively stress-tested to weed out problems that occur with multiple concurrent tasks. TACC can reduce the cost, complexity and duration of QA cycles by enabling programmers to write small single-threaded processes that can be unit tested and that work together without the overhead of the IPC and RPC mechanisms of other systems.
TACC Features
TACC comprises a schema language for describing distributed object types, a compiler that generates C++ code, and run-time components that include inline code generated by the compiler. TACC also has optional cluster and service management components. OptumSoft itself uses TACC to develop and evolve all of these components.
The TACC compiler simplifies development by generating optimized code and function calls to perform runtime tasks, including memory management, communications among processes (IPC), database access and failure handling. As a result, systems developed using TACC have a smaller codebase.
From the programmer’s perspective, the TACC database looks like a virtual object store containing all the entities, without concern for where they are physically stored. Programmers do not need to be concerned with memory constraints, whether or not the database is sharded, or fault tolerance.
Using TACC, developers write business logic in small, isolated, single-function processes. With no user-managed inter-process communication (IPC) code intermingled with business logic, applications are simpler to write, debug, and evolve.
Multiple application processes share information by referencing the same objects, and each process treats objects it accesses as local data.
Application processes are essentially stateless. In the case of a restart, a process retrieves state from its corresponding object store extremely quickly, typically in less than 10 milliseconds. Application processes might hang or crash because of programming errors, but they can be restarted without taking down the whole system, and application modules can be added and upgraded on the fly.
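The restart behavior described above can be illustrated with a minimal Python sketch. It is not TACC code: the `ObjectStore` and `CounterAgent` classes and the `event_counter` key are hypothetical names invented for this illustration, and the store is a plain in-process dictionary rather than a real object database. The point is only to show why an agent that keeps no private durable state can be restarted without losing work.

```python
class ObjectStore:
    """In-memory stand-in (hypothetical) for a store that outlives any one agent."""

    def __init__(self):
        self._objects = {}

    def load(self, key, default):
        # Return a copy so the agent works on its own local data.
        return dict(self._objects.get(key, default))

    def save(self, key, state):
        self._objects[key] = dict(state)


class CounterAgent:
    """Hypothetical agent whose only durable state lives in the store."""

    def __init__(self, store, key="event_counter"):
        self._store = store
        self._key = key
        # On (re)start, recover whatever state the store already holds.
        self.state = store.load(key, {"count": 0})

    def handle_event(self):
        self.state["count"] += 1
        # Publish every change, so a crash loses nothing already handled.
        self._store.save(self._key, self.state)


store = ObjectStore()
agent = CounterAgent(store)
for _ in range(3):
    agent.handle_event()

del agent                        # simulate a crash of the agent process
restarted = CounterAgent(store)  # a fresh agent picks up where the old one stopped
restarted.handle_event()
print(restarted.state["count"])  # prints 4
```

In this sketch the agent saves state explicitly; according to the description above, TACC handles state management through compiler-generated code, so application code would not contain the equivalent of the `save` call.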
Because all references (even writes) access local copies of objects, user code does not need to handle failures or delays caused by remote communication mechanisms. Compared to other approaches that use remote procedure calls (RPC), the performance of TACC is typically significantly higher, with lower and more predictable latency and greater state consistency.
Each TACC agent process is single-threaded and does not require any concurrency control mechanisms (such as locking), which further simplifies development and debugging.
How to Use TACC
To build a TACC application, first determine the data flow and the data objects that the application is going to access and modify. As with a more traditional database, you then create a schema definition, which in TACC is a text file describing one or more groups of objects. The TACC compiler uses the schema definition to create an in-memory database.
Once you have created the application business logic, the TACC compiler reads the application source code containing the TACC-unique constructs and generates library calls and inline code to handle state management. The TACC compiler generates C++ output in .h and .cpp files that include inline code and calls to shared libraries. The standard GCC (GNU Compiler Collection) compiler then compiles the TACC output, producing .so (shared library) files.
For Python, the interpreter dynamically loads the shared libraries, and the TACC data types are accessed and manipulated using standard Python code.
The TACC framework and runtime include optional features that do not require any special application code. These include dynamic database sharding for large datasets and high IO bandwidth, and HA support using primary and secondary database components with automatic failover, leader election, and disk persistence.
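As a rough illustration of primary/secondary HA with automatic failover, the following Python sketch keeps two replicas of a key-value store and promotes the secondary when the primary is marked as failed. Everything in it is an assumption made for illustration: the `Replica` and `HAStore` classes, the `alive` flag used to simulate a failure, and the choice of synchronous replication on every write. The source does not describe how TACC replicates data or detects failures, so this shows the general technique, not TACC's implementation.

```python
class Replica:
    """One copy of the data (hypothetical stand-in for a database component)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True  # flipped to False to simulate a failure


class HAStore:
    """Key-value store that fails over from primary to secondary."""

    def __init__(self):
        self.primary = Replica("A")
        self.secondary = Replica("B")

    def _failover_if_needed(self):
        if self.primary.alive:
            return
        if not self.secondary.alive:
            raise RuntimeError("no live replica available")
        # Promote the secondary; the failed replica becomes the standby.
        self.primary, self.secondary = self.secondary, self.primary

    def put(self, key, value):
        self._failover_if_needed()
        self.primary.data[key] = value
        if self.secondary.alive:
            # Assumed policy: replicate synchronously on every write.
            self.secondary.data[key] = value

    def get(self, key):
        self._failover_if_needed()
        return self.primary.data[key]


ha = HAStore()
ha.put("balance", 100)
ha.primary.alive = False   # simulate loss of the primary
print(ha.get("balance"))   # prints 100, served by the promoted replica
print(ha.primary.name)     # prints B
```

The caller uses the same `put` and `get` calls before and after the failure, which mirrors the claim above that these features require no special application code.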
How TACC Works
Physically, the object store consists of one or more single-threaded processes that operate as in-memory databases (“sysdbs”). As illustrated in the first diagram, depending on how you configure and install the system components, sysdbs may reside on a single CPU device, or be distributed, for example, across processors on line cards in a telecom equipment chassis, across a virtual machine cluster, or across multiple cloud services spanning the globe.
Application processes (“agents”) access the entities by mounting local copies of objects in the virtual object store. The concept of “mount” is akin to a file system mount in Linux, and is invoked by “mount” and “unmount” operations in TACC. The mount operation has options such as whether the application process has read or write access to the object. As shown in the second diagram, this allows an application process to mount individual objects, or groups of objects, that may be located anywhere from the same physical device as the application process to the virtual memory of a virtual machine in a cloud data center, using identical application code.
For example, in a temperature sampling and display system, both a temperature-sampling agent and a temperature-display agent could mount instances of the object named Thermometer, which contains temperature values. In a banking application, both the agent that keeps track of the current account balance and the agent that displays the value would mount the object named CurrentBalance.
When the agent writes the new value to its local copy of the temperature reading, or the bank account balance, TACC is responsible for propagating this local change both to the master Thermometer or CurrentBalance object and to other agents. Depending on how you wrote the program, TACC can automatically notify the agent responsible for displaying the value so that agent can update its display.
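The mount, propagate and notify flow in the thermometer example can be sketched in Python as follows. This is a conceptual model only, with no claim to match the TACC API: the `ObjectStore`, `MasterObject` and `Mount` classes, the `writable` option and the `on_change` callback are all hypothetical names chosen for this illustration, and propagation here is a direct in-process call rather than communication between processes.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class MasterObject:
    """Master copy of a shared object, plus the mounts that mirror it."""
    name: str
    value: float = 0.0
    mounts: List["Mount"] = field(default_factory=list)


class Mount:
    """An agent's local copy of a master object (hypothetical model)."""

    def __init__(self, master: MasterObject, writable: bool,
                 on_change: Optional[Callable[[float], None]] = None):
        self._master = master
        self._writable = writable
        self._on_change = on_change
        self.value = master.value  # local copy taken at mount time
        master.mounts.append(self)

    def write(self, value: float) -> None:
        if not self._writable:
            raise PermissionError(f"{self._master.name} is mounted read-only")
        self.value = value           # the agent writes its local copy
        self._master.value = value   # the change reaches the master object
        for other in self._master.mounts:
            if other is not self:
                other._receive(value)  # and then every other mount

    def _receive(self, value: float) -> None:
        self.value = value
        if self._on_change is not None:
            self._on_change(value)   # notify the owning agent

    def unmount(self) -> None:
        self._master.mounts.remove(self)


class ObjectStore:
    """Creates master objects on first mount and hands out local copies."""

    def __init__(self):
        self._objects = {}

    def mount(self, name, writable=False, on_change=None) -> Mount:
        master = self._objects.setdefault(name, MasterObject(name))
        return Mount(master, writable, on_change)


store = ObjectStore()
shown = []  # stands in for the display
sampler = store.mount("Thermometer", writable=True)
display = store.mount("Thermometer", on_change=shown.append)
sampler.write(21.5)
print(display.value, shown)  # prints 21.5 [21.5]
```

The sampling agent only ever touches its own `Mount`, and the display agent learns of the change through its callback, so neither agent contains any code that addresses the other.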
Summary
TACC provides a framework for developers that is flexible, is designed for high productivity, and relieves programmers of writing the tedious and error-prone code associated with distributed scale and HA. It is efficient, with a small footprint suitable for embedded systems, yet scales to very large cloud requirements, with service provider-class robustness.
By following the programming model of small, simple, single-threaded application processes, your development team can achieve high productivity and shorter software testing cycles, and can provide bug fixes and new features to customers more quickly than with alternative approaches.