We had a chance to briefly touch on this topic in our founder John's latest webinar with Infosys' Aparna Sharma on "Accelerating Modern Application Quality" -- archive available now.
A question came in during the show that was particularly pertinent to the topic at hand. How can these virtualized services and data models work in the Cloud to allow collaboration with distributed teams, especially offshore test teams? Indeed, without a good approach for virtualizing data in a Cloud-based development environment, you simply cannot get there without wasting about 80% of your effort and cost in testing today's service-based applications.
You expected ubiquitous, flexible access to needed resources with Cloud Computing. So why can testing teams end up sitting on the bench instead of getting involved as early as possible? Much of the challenge lies within the test data itself.
In short there are "Wires hanging out of your Test Cloud" as we illustrate here - especially when it comes to Data. You can't just take a 4TB database or an external SaaS-type data provider and plop them into a hypervisor and provision them in the Cloud. Test Data is a frontier because it is not easy to replicate via known means - it is either too bulky or too dynamic to reliably move to the Cloud, but we absolutely must provision teams with good test data, in order to do enough testing to rely upon these systems for business.
First, there is the problem of valid data access needed for thorough testing. Most often, this data is locked up in some mainframe or live service that is simply too critical to open up to the testing team. Since business continuity usually trumps new product testing, IT Ops guys will crack down on test access time windows to the live systems and the data within them. So you might have a 2-hour period of access - barely enough time to test an on-premise system, much less a Cloud-based system that is talking to other data providers.
Second, there is the problem of data sensitivity. Even if we have contracts and a good working relationship in place, we don't want to be putting data that needs to stay private in the hands of any third party. So we do need to mask or "desensitize" the data so it follows the expected structure, without BEING someone's social security number or bank account. This has always been a concern in healthcare, for instance (see HIPAA guidelines), but the level of certification, compliance and controls around data privacy and security in every industry is only getting tighter. We need to be able to give these remote teams obfuscated data so they can continue testing efficiently.
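To make the "same structure, different value" idea concrete, here is a minimal Python sketch of structure-preserving masking. The function name `mask_value` and the secret key are made up for illustration, and this is not how any particular TDM product does it. It only shows that digits can stay digits, letters can stay letters, and punctuation can stay in place while the actual value is replaced. A real desensitization process would need far more care (key management, re-identification risk, referential rules), so treat this as a toy.

```python
import hashlib
import hmac


def mask_value(value: str, secret: bytes) -> str:
    """Return a stand-in with the same shape as `value`.

    Digits map to digits, letters to letters (case kept), and everything
    else (dashes, spaces) is left alone. The same input and secret always
    give the same output, so masked keys still line up across tables.
    Illustrative only: this is not a vetted anonymization scheme.
    """
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]  # reuse digest bytes for long values
        if ch.isdigit():
            masked.append(str(b % 10))
        elif ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            masked.append(chr(base + b % 26))
        else:
            masked.append(ch)
    return "".join(masked)


# Hypothetical sample value and key, not real data.
masked_ssn = mask_value("123-45-6789", b"test-only-secret")
```

The masked value keeps the `NNN-NN-NNNN` layout, so a remote team's validation logic and screens should still behave, but the digits no longer belong to anyone.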
Third, data setup and teardown continues to be difficult in Cloud environments. We touched on one of our favorite examples in the webinar, a Telco that literally spends 2 hours running tests, and 2 whole DAYS resetting data. Many of these problems won't go away when the Cloud-provisioned system is talking to external dependencies and systems of record that need to be accessed and later cleaned out of running systems. Quite often this is a main point of contention, as a test could corrupt the tests of other teams or even the live systems if not contained correctly.
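For a sense of what "contained correctly" can look like in the easy case, here is a small Python sketch using the standard library's `sqlite3` module. It assumes a single database that the test fully controls; the `accounts` table and the `isolated_test_data` helper are invented for this example. Each test runs inside a transaction that is rolled back afterwards, so teardown takes no extra time. The hard part described above is exactly that mainframes and external systems of record do not offer this kind of rollback to a test team.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def isolated_test_data(conn):
    """Run a test inside a transaction and always roll it back."""
    conn.execute("BEGIN")
    try:
        yield conn
    finally:
        # Teardown: discard whatever the test changed.
        conn.execute("ROLLBACK")


def balance(conn, account_id):
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


# isolation_level=None puts the connection in autocommit mode,
# so the explicit BEGIN/ROLLBACK above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")  # baseline test data

with isolated_test_data(conn) as c:
    c.execute("UPDATE accounts SET balance = 0 WHERE id = 1")
    during = balance(c, 1)  # the test sees its own change

after = balance(conn, 1)  # baseline is back once the test finishes
```

Here `during` reflects the test's change and `after` is back at the baseline value, with no separate reset job to run.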
These are just three concerns. There are many other open issues of how Test Data will be handled when we move to Cloud-based environments. I thought Judith Hurwitz nailed this as one of her observations in her "What are the Unanticipated Consequences of Cloud Computing" post:
"Data will increasingly be seen as a reusable resource that can be used in lots of different situations. There will continue to be strategic line of business applications but they will be more systems of record that keep track of the final result of actions that take place dynamically in the cloud. The value of data is not in its tight packaging as we have been used to for decades but in the flexibility to move, transform, and leverage data. The watch word for data in this new model will be Trusted Data in the Cloud."
Couldn't agree more with this observation. As we move to ever more distributed, flexible computing models, we are supporting these with very distributed development and testing teams. The data needed for the software lifecycle must be shared among business partners and onshore and offshore teams, so success will all come back to our level of Trust that the Cloud will give us the data we need, and not give us the side effects of data we don't need.
Of course, here at iTKO, we've been stretching out our LISA Virtualize to better support Test Data Management (TDM) solution efforts with the capture, manipulation, dynamic "desensitization" and Virtualization of data used in Cloud-based development environments. Look for more discussion and research from us soon on how expert testing teams are taking advantage of Virtual Test Data in the cloud.
