Sunday, January 25, 2015

Why DevOps: definition and business benefit

Can you just imagine the magic of pushing one button and see your company’s new project materialize in a production environment? The software application code compiled and built, the data center infrastructure (heterogeneous and complex) set up, the application deployed and tested - ensuring compliance - and the business stakeholders given their beloved new campaign ready to use?



I’m working with colleagues and customers to better understand why there’s so much interest around DevOps, what are the business benefit and useful technology.
Here is that state of our reasoning, along with some notes that I collected in my research.

To make the matter easier for people that don’t have experience in managing the software release cycle, I imagined to take a triangle approach: analyse the business drivers that need to be addressed, what is a operational model that could provide the expected results and, finally, what is the enabling technology. With this top-down approach, understanding the concept gradually should be easier for IT professionals that are not expert in the field.



Business Drivers

In every company, Lines Of Business have a dream: have a new solution live in 1 month. It could be a marketing campaign, a new service for their customers, a process to produce new goods.
They think they just need smart developers and the availability of the required infrastructure, that given the spending on IT should not be an obstacle.
Unfortunately, sometimes they feel that IT is not efficient enough. It’s not a matter of technology, but of organization.

Some notes from https://puppetlabs.com/blog/why-every-cfo-should-advocate-devops (Bill Koefoed, the author, is the chief financial officer of Puppet Labs).
IT is the manufacturing of the 21st century. Let’s face it, most products and services these days depend on software, from social media to teleconferencing to household appliances that interact via the internet.
To get ahead of competitors, you have to get your new products and services out fast, test them for customer response, and quickly update to satisfy customer desires. Even as you’re increasing your rate of output, you have to reduce flaws, whether in delivery or the product itself.
That’s why DevOps is so important: The tools, practices and cultural orientation of DevOps enable greater efficiency in IT. Our 2014 State of DevOps report bears this out, both in terms of software throughput and business results. From the standpoint of throughput, we validated last year’s findings that high-performing IT teams (as defined by deployment frequency, lead time for changes and mean time to recover from failure) deploy up to 30 times more frequently than their lower-performing peers, with 50 percent fewer failures. This year, the most provocative finding was the strong connection we found between IT performance and financial performance. Companies with high-performing IT teams got better business results, as they were:
  • 3.3 times more likely to have met or exceeded the company’s productivity goals.

  • 1.6 times more likely to have exceeded company profitability targets.


In this post I will describe DevOps in gereral terms, with some focus on the Business Benefit.
In my next posts I will approach the Operational Model and investigate the technology that can help. Not only tools for Continuous Integration and Continuous Delivery, but also the concept of “infrastructure as code” that allows a flexible and agile use of the infrastructure resources in the same cycle as the software for the applications.


DevOps is a software development method that stresses communication, collaboration (information sharing and web service usage), integration, automation and measurement between software developers and Information Technology (IT) professionals. DevOps is a response to the interdependence of software development and IT operations. It aims to help an organization rapidly produce software products and services and to improve operations performance - quality assurance.

The specific goals of a DevOps approach span the entire delivery pipeline, they include improved deployment frequency, which can lead to faster time to market, lower failure rate of new releases, shortened lead time between fixes, and faster mean time to recovery in the event of a new release crashing or otherwise disabling the current system. Simple processes become increasingly programmable and dynamic, using a DevOps approach, which aims to maximize the predictability, efficiency, security, and maintainability of operational processes. Very often, automation supports this objective.

Development methodologies (such as agile software development) that are adopted in a traditional organization with separate departments for Development, IT Operations and QA, development and deployment activities, previously do not have deep cross-departmental integration with IT support or QA. DevOps promotes a set of processes and methods for thinking about communication and collaboration between departments.



The adoption of DevOps is being driven by factors such as:
  1. Use of agile and other development processes and methodologies
  2. Demand for an increased rate of production releases from application and business unit stakeholders
  3. Wide availability of virtualized and cloud infrastructure from internal and external providers
  4. Increased usage of data center automation and configuration management tools

Use Cases

Automating the release of a complete system can provide advantages in the following situations (a partial list - if you have more reference, please add a comment to this post):
  1. daily builds in the development environment
  2. move the system to the integration environment and to QA
  3. regression testing after a patch
  4. move a system from testing to production
  5. deploy a copy of the system for a new tenant
  6. copy a system to Disaster Recovery
Keep in mind that, thanks to the management of "infrastructure as code", you can have the end to end system managed this way... not only the software code.

Adoption in the world
DevOps is ramping in the US, it seems to be a little late in Europe - as many innovations in the IT.
Companies that benefit already from the introduction of this methodology are:
Google, Amazon, Netflix, Facebook, Twitter, Pinterest, Bank of America, Cisco and more...

Industry headlines tell us every day that companies rise and fall on moments of infectious delight and irritated disappointment. It's not enough to have a great idea and execute on it once. You have to execute, get feedback, refine, and execute again - and again and again. To keep competitors from grabbing a piece of your market, you need to cycle with ever-increasing speed and agility.

Multiple, independently conducted research studies show that, not only are enterprises already adopting DevOps, they are achieving substantial outcomes.
One such study, conducted by independent research organization IDG, shows that enterprises (measured by having more than $500 million in revenues) are adopting DevOps at an even faster rate than smaller businesses. Another study, conducted by independent research firm Vanson Bourne, found that large enterprises are not only adopting DevOps, but more than 90% have seen or expect to see significant benefits, with quantifiable improvements in delivery speed, development and operations costs, defect detection, ability to innovate, and many more, ranging from 17% to 23%. Then there is additional research from InformationWeek, which also shows high rates of adoption and benefits for large enterprises (measured by having 5,000 or more employees).

In next post, I’ll try to define a operational model:
what teams are involved
what processes do you need
what information do you need 
what roles do you need
what skills do you need 

Link to next post: DevOps - Operational Model.

Sources:
1 - https://puppetlabs.com/blog/why-every-cfo-should-advocate-devops (Bill Koefoed, the author, is the chief financial officer of Puppet Labs.)
 


Monday, January 19, 2015

The Elastic Cloud project - Methodology

This posts is the continuation of the post The Elastic Cloud Project - Architecture.
Here I will explain how we worked in the project: the sequence of activities that were required and the basic technologies we adopted.
The concepts are mostly explained by using pictures and screen shots, because an image is often worth 1000 words.
If you are interested in more detail, please add a comment or send me a message: I’ll be glad to provide detailed information.

To begin with, we had to:
  • map the data model of the products used to understand what objects should be created, for a Tenant, in all the layers of the architecture
  • create sequence diagrams to make the interaction clear to all the members of the team - and to the customer
  • understand how the API exposed by Openstack Neutron and from Cisco APIC work, how they are invoked and what results they produce
  • implement workflows in the CPO orchestrator to call the APIC controller and reuse the existing services in Cisco IAC
  • integrate Hyper-V compute nodes in Openstack Nova
  • create a new service in the Service Catalog to order the deployment of our 3 tiers application

Some detail about the activities above:

1 - Map the data model of the products used to understand what objects should be created, for a Tenant, in all the layers of the architecture



know that some of you still don’t know Cisco ACI… I promise that I will post a “ACI for dummies” soon.   :-)


  
This picture shows how concepts in Openstack Neutron map to concepts in Cisco ACI:


2 - Create sequence diagrams to make the interaction clear to all the members of the team


3 - Understand how the API exposed by Openstack Neutron and from Cisco APIC work, how they are invoked and what results they produce

This is a call to the Cisco APIC controller, using XML


This is a call to the Openstack Nova API, using JSON:

to do this, we used a REST client to learn the individual behavior and how the parameters need to be passed
a REST call is essentially a http call (GET or POST) where the body contains XML or JSON documents
some http headers are required to specify the content type and to hold security information (like a token for single sign on, that is returned by the authorization call and you need to resend in all the following calls to be recognized.
So we adopted Google Postman, that is a plugin for the Chrome browser (latest version is also released as a standalone application) to practice with the REST Calls then,after we learned how to manage them, we just copied the same content (plus the headers) into the “http call” tasks in the CPO workflow editor.



The XML or JSON variables that we passed are essentially static documents with some placeholders for current values, i.e. the Tenant name, the Network name, etc. were passed according to the user input.
Of course the XML elements tags are described in the APIC product documentation, you don’t have to reverse engineer their meaning   ;-)
Another way to get the XML ready to use is to export it from the APIC user interface: if you select an object that has been created already (either though the GUI or the API), you can export the corresponding XML definition:



This is how we copied the XML content from the test made in Postman and replaced some elements with placeholders for current values (that are variables in the workflow designer):

This is how the variable appear in the workflow instance viewer, after you have executed the process because a user ordered the service:


4 - Implement workflows in the CPO orchestrator to call the APIC controller and reuse the existing services in Cisco IAC

An example of the services that Cisco IAC provides out of the box.
They are also available through the API exposed by the product, so we created a custom workflow that reused some of the services as building block for our use case implementation.
his is the workflow editor, where we created the orchestration flow:



5 - integrate Hyper-V
At the time of this project, a direct support for Microsoft Hyper-V was not available in Openstack Nova.
But a free library was available from Cloudbase, so we decided to install it on our Hyper-V serverso that the virtual data center (VDC) we had created in Cisco IAC thanks to the integration with Openstack could use also Hyper-V resources to provision the VM.
More detail on the integration can be found here: http://www.cloudbase.it/openstack/
In the current Openstack release (Juno), Hyper-V servers are managed directly.


6 - create a new service in the Service Catalog

Conclusion

This project had a complexity that derived from being the among the first teams in the world to try the integration of so many disparate technologies: Cisco software products for Service Catalog and Orchestration, three hypervisors (ESXi, Hyper-V, and KVM), physical networks (Cisco ACI) and virtual networks in all the hypervisors, Openstack.
I didn't tell you, but also load balancers and firewalls were integrated.
Maybe I will post some detail about the Layer 4 - Layer 7 service chaining in the next weeks.
We had to learn the concepts before learning the products. Actually theinvestigation of the API and their integration was the easiest part... and was also fun for my ancient memory of programmer   :-)

Now, with the current release of the products involved in this project, everything would be much easier.
Their features are more complete (actually the integration of the Neutron API in the management of Virtual Data Centers in ACI was fed back to our engineering during this project).
Skills available on the field are deeper and widespread.

I've already implemented the same use case with alternative architectures twice.
Cisco UCS Director was used once, replacing the IAC orchestration and pre-built services.
And, in another variation, the Openstack API were integrated directly instead of reusing the existing services that manage the Openstack VDC in IAC.
Just to have more fun... ;-)

Thursday, January 15, 2015

The Elastic Cloud Project - Architecture

This posts is the continuation of The Elastic Cloud Project post.

There is a team at Cisco, called System Development Unit, that creates reference architectures and CVD (Cisco Validated Design).
They work with the product Business Units to define the best way to approach common use cases with the best technology.
But at the time of this project, they hadn’t completed their job yet (some of the products were not even released).
So we had to invent the solution based on our understanding of the end to end architecture and integrate the technologies on the field.

As I explained before, the most important components were:
- servers - Cisco UCS blades and rack mount servers
- network - Cisco ACI fabric, including the APIC software controller
- virtualization - ESXi, Hyper-V, KVM
- cloud and orchestration software - Cisco PSC and CPO, Openstack (PSC and CPO, plus pre-built services, make up Cisco Intelligent Automation - IAC)
IAC can integrate different “element managers” in the datacenter, so that their resources are used to deliver the cloud services (e.g. single VM or Virtual Data Centers - VDC).
Element managers include vmware vCenter and Openstack, so a end user can get a VDC based on one of these platforms.
There is a autometed process in IAC, called CloudSync, that discovers all the resources available in the element managers and allow the admin to select those he wants to use to provision services (resource management and lifecycle management are amongst the features of the product).

The ACI architecture
I will cover it in detail in one of next posts, but essentially ACI (http://www.cisco.com/c/en/us/solutions/data-center-virtualization/application-centric-infrastructure/index.html), that stands for Application Centric Infrastructure, is a holistic architecture with centralized automation and policy-driven application profiles. ACI delivers software flexibility with the scalability of hardware performance. 
Cisco ACI consists of:



The policies that you create in the software controller (APIC) are enforced by the fabric, including physical and virtual networks.
You describe the behavior your application need from the network, non the configuration you need.
This is easier for the application designer, in the collaboration with network managers, because it can be graphically described by the Application Network Profile.
A profile contains End Point Groups (EPG, representing deployment units of the application: both physical and virtual servers) and Contracts (that define the way EPG can communicate).
A profile can be saved as a XML or JSON document, stored in a repository, participate in the Devops lifecycle, used to clone a environment and managed by any orchestrator.
ACI is integrated with the main virtualization platforms (ESXi, Hyper-V, KVM).

To deploy our 3 tier application on 3 different hypervisors, we had to manage vmware and Openstack separately - but in a single process, because everything should be provisioned with a single click.
Initially we based our custom implementation of the new service on the standard IAC services, using them as building blocks.
So we had not to implement the code to create a network, create a VDC, create a Virtual Server, trigger CloudSync, integrate the virtual network with the hardware fabric.
This sequence of operations was common to the Openstack environment and the vmware environment.
The main workflow was built with two parallel branches, the Openstack branch (creating 2 web servers on one network and 1 application server on anothernetwork) and the vmware branch (doing the same for the database tier).



The problem is that the integration of IAC with Openstack, in the 4.0 release that we used at that time, only deals with Nova - that, in turn, manages both KVM and Hyper-V servers.
No Neutron integration was available out of the box, hence no virtual networks for the Openstack based VDC.
So we built the Neutron integration form scratch (implementing direct REST calls to the Neutron API) to create the networks.

The ACI plugin for Neutron does the rest: it talks to the APIC controller to create the corresponding EPG (End Point Group).
This implementation has been fed back to IAC 4.1 by the Cisco engineering, so in the current release it is available out of the box.

Solution for Openstack
A plugin distributed by Cisco ACI was installed in Openstack Neutron, to allow it to integrate into the APIC controller.
This is transparent to the Openstack user, that goes on working in the usual way: create network, create router, create VM instances.
Instructions are sent by Openstack to APIC, so that the corresponding constructs are deployed in the APIC data model (Application Profiles, End Point Groups, Contracts).
The orchestrator can then use these objects to create a specific application logic, spanning the heterogeneous server farms and allowing networks in KVM to connect to networks in ESXi and Hyper-V.
So the workflow that we built only needed to work with the native API in Openstack.

Logical flow:
— web tier --
create a virtual network for the web tier via Neutron API
     the Neutron plugin for ACI calls - implicitly - the APIC controller and creates a corresponding EPG.
     the Neutron plugin for OVS creates a virtual network in the hypervisor's virtual switch
trigger the CloudSync process, so that the new network is discovered and attached to the VDC
create a VM for the web server and attach it to the network created for the web tier
     this was initially done by reusing the existing IAC service “Provision a new VM"
— application server tier — 
create a virtual network for the app tier via Neutron API
     the Neutron plugin for ACI calls - implicitly - the APIC controller and creates a corresponding EPG.
     the Neutron plugin for OVS creates a virtual network in the hypervisor's virtual switch
trigger the CloudSync process, so that the new network is discovered and attached to the VDC
create a VM for the application server and attach it to the network created for the app tier
     this was initially done by reusing the existing IAC service “Provision a new VM"
— connect the tiers via the controller —  
connect the two EPG with a Contract, that specifies the business rules of the application to be deployed
     this is done via the APIC Controller’s API, creating the Application Profile for the new application in the right Tenant


Solution for vmware
The APIC controller has a direct integration with vmware vCenter, so the integration is slightly different from the Openstack case:
The operations are performed directly against the APIC API and, when you create a EPG there, APIC uses the vCenter integration to create a corresponding virtual network (a Port Group) in the Distributed Virtual Switch.
So we added a branch to the main process to operate on APIC and vCenter, to complete the deployment of the 3 tier application with the database tier. 

Logical flow:
— database server tier — 
call the APIC REST interface, implementing the right sequence (authentication, create Tenant, Bridge Domain, End Point Group, Application Network Profile).
     specifically a EPG for the database tier is created in the APIC data model, and this triggers the creation of a port group in vCenter.
trigger the CloudSync process, so that the new network is discovered and attached to the VDC
create a VM for the database server and attach it to the network created for the app tier
     this was initially done by reusing the existing IAC service “Provision a new VM"


Service Chaining
The communication between End Point Groups can be enriched by adding network services: load balancing, firewalling, etc.
L4-L7 services are managed by APIC by calling external devices, that could be either physical or virtual.
This automation is based on the availability of device packages (set of scripts for the target device), and a protocol (Opflex) has been defined to allow the declarative model supported by ACI being adopted by all 3rd party L4-L7devices. 
Cisco and its partners are working through the IETF and open source community to standardize OpFlex and provide a reference implementation.


In next post, I will describe the methodology we used to integrate the single pieces of the architecture, how we learned to use the API exposed by the target systems (APIC and Openstack) and to insert these calls into the orchestration flow.

Link to next post: The Elastic Cloud Project - Methodology


Wednesday, January 7, 2015

The Elastic Cloud Project

The Elastic Cloud was a pilot project that I led as a software architect for a customer.
They wanted to evaluate the complexity and the overal quality of a end to end solution provided by Cisco for Cloud Computing.
So we built a complete infrastructure, including hardware and software, for a public cloud.
The number of use case was limited, and I will tell you about the most curious one.
When I was told about the requirement, I thought it didn’t make sense in the real world… but later I understood the rationale  :-)

The request was to deploy a three tier application, with a single click, provisioning also all the needed infrastructure at the same time: (virtual) servers, network and storage.
So this is an example of software defined datacenter, not only SDN but the entire stack: from hardware to the sw application. 
The strange aspect of this requirement is that every tier of the application should run on a different virtualization platform: web servers on KVM, application servers on Hyper-V, database server on ESXi.
It was a technical demonstration, sure, but it also reflected a real world use case. They have some customers that, for legacy applications, have certification constraints that mandate a specific hypervisor for at least one part of the application. So the deployment cannot be standardized on a single platform.
Our customer wanted to verify that Cisco is able to orchestrate a multi vendor infrastructure and that the advantages of stateless computing (I will explain it in a different post, but essentially it is joining SDN with the quality of a hardware infrastructure) are not compromised by the etherogeneity of the virtualization layer.
They said that no other vendor was able to implement the entire use case on the 3 major hypervisors together.

This is a very high level description of the use case:



We decided to implement the self service process in the portal provided by Cisco Prime Service Catalog.
The orchestration layer was Cisco Process Orchestrator, interfacing with Openstack and vmware vCenter directly.
Openstack had compute nodes running KVM and Hyper-V.
Virtual networks were created, on demand, in the virtual switches in each hypervisor and then joined by the physical network connecting the virtualized servers.
The configuration of the network was done through the APIC software controller, that is part of the Cisco ACI architecture (Application Centric Infrastructure).


There is a solution, offered by Cisco, that provides cloud services out of the box.
It is called Intelligent Automation for Cloud (IAC) and it is based on the Prime Service Catalog (PSC) as a front end, the Cisco Process Orchestrator (CPO) and prebuilt services created by the Cisco engineering.
IAC integrates your infrastructure by interfacing with vCenter, vCD, Openstack and other so called “element managers” and has all the tools to manage the resources lifecycle.
We decided to reuse some of the existing services as building blocks, wrapping them in a custom process implementing the logic of the specific use case.
So we created a workflow in the visual editor of CPO that invokes the existing atomic services: create a virtual network, create a VDC (virtual data center), create a virtual server.
Then we added explicit calls to the API of the target systems that were not integrated in IAC: the APIC Controller for the ACI fabric, Neutron as the manager of the networking in Openstack (the 4.0 release of IAC only managed Nova for provisioning VMs in Openstack, now also Neutron is integrated in IAC 4.1).

So the effort in this project was only dedicated to understanding the overall logic of the use case and implementing the needed API calls.
It was not particularly difficult, because the target API are all based on a REST interface (both for APIC and Neutron) so invoking them from CPO was a kids’ play.
We created a process with 3 branches, one for each tier of the application, creating all the needed networks and virtual machines, then “plumbing” all together with the ACI fabric through the API of the APIC controller.

We were forerunners, trying to implement a exotic use case before the Business Units looked at it… now everything is available out of the box    ;-)
So we faced the following issues:
  • Lack of reference architecture and some products features.
  • Dispersed team (time zones and location) made the coordination difficult.
  • Fragmented skills: none of us had the complete knowledge of all the products and technologies, due to the amount of innovation involved. 
  • Multitasking: many of us were working part time, engaged on different projects.
  • Generous support from individuals and organizations in the company, but limited governance
  • Products limitations discovered in progress (and solved with... fantasy)
  • Usage of beta code and daily builds for some of the products
  • Limited documentation available  

In next post I will tell the entire story and how we were able to demonstrate the main concepts:
  • Cisco ACI is one of the best solutions for SDN and is not limited to software overlay only
  • Cisco has a end to end solution for cloud, including sw and hw and people that can design the architecture
  • Open source solutions can be easily integrated into a commercial architecture, providing additional value
  • The contribution Cisco provided to Openstack allows customers to manage our network fabric from Neutron seamless