Agile Meets Research / Data Scientists
As agile spreads wider and wider, I often get to work with researchers (a.k.a. data scientists) who work closely with product development. When going agile, these people struggle to figure out how it fits their unique style of work. One of the researchers I encountered, in an Advanced Analytics group at Intel, was Shahar. We had a chat recently and I asked him if he would be so kind as to write a guest post describing his perspective. And he delivered! If you’re a researcher trying to make agile work, or you’re implementing agile and trying to help your researchers figure it out, this should be interesting!
5 Guiding Principles in Implementing Agile Methodology in Research-Intensive Software Environments
Agile software development is becoming common practice in more and more software organizations. For organizations that used to work in the traditional, waterfall way, going agile means letting go of many myths. It’s never easy. However, with purely functional products, the superiority of the new method often becomes clear within a reasonable period of time, and that usually wins the argument.
In addition to the regular, inherent pain caused by change, implementing agile in research-intensive domains (such as products based on machine learning) faces some additional obstacles. Research carries a significantly higher degree of uncertainty. Research means finding new knowledge that is currently unavailable, and it is hard to predict the exact form of that knowledge and the resources needed to find it. Moreover, researchers are trained and educated in academia, where in many cases there are no customers to collaborate with, comprehensive documentation is the ultimate outcome, and the tightest compliance with tools and standards is required in order to compare with previous work. Not surprisingly, the term “research” is intimidating to many managers and business personnel.
The question “Is agile methodology appropriate in research-intensive software environments?” seems rhetorical given the title of this post, but it is being asked by many software organizations, including organizations that run agile fluently for their purely functional features. Undoubtedly, the answer is a big yes. However, research-intensive software environments have some unique characteristics that need to be considered in almost every aspect of agile implementation.
In this post I suggest 5 key considerations that deserve closer attention when implementing agile methodology in research-intensive software environments.
1. It is not all or nothing
In purely functional features, there are two important considerations: correctness and engineering quality. The team will typically not commit to a feature if it cannot be delivered correctly and with sufficient quality. In functional features, correctness is usually well defined. In research-intensive features, the knowledge required for developing the new feature is initially absent, and so, often, is the exact outcome. In many cases there may be several levels of correctness. Consider, for example, an analytic feature that includes a prediction model. With no model, the prediction is random. With the “optimal” model, the prediction accuracy might be, for example, 95%. However, the concept of an optimal model is merely theoretical, and getting near this 95% requires a lot of effort. It is often the case that a skillful data scientist can very quickly come up with a model that provides 85% accuracy, or even 90%. Even assuming that this model is implemented with high engineering quality, the question of correctness remains: is 85% correct enough? Going agile often requires preferring quick solutions over optimal ones. As a matter of fact, prediction tasks are an easy example, in the sense that the solutions have an objective measure of correctness (the prediction accuracy, in this case). How would you measure or compare algorithms that cluster objects into groups, or algorithms that automatically summarize long documents? Research domains often require baby steps and solutions that improve iteratively. Research is a product that evolves continuously, not a one-time project. Giving up the aim of optimality is sometimes hard for researchers who were educated in academia, where user engagement is not a consideration, and it often requires a cultural change.
2. Create a buffer to reduce the impact of uncertainty
Uncertainty regarding the outcomes is common to any significant research. It is true that any planning includes some uncertainty, but in research the level of uncertainty is far higher. This huge uncertainty makes it dangerous to include both the research and the implementation of the same feature in a single iteration or plan. Separating the two phases into two different (possibly successive) iterations leaves more room for surprises. In research, failing to prove feasibility is an acceptable outcome (finding yet another way that does not work is frustrating, but it happens a lot). Failing to deliver a feature is less acceptable. Since the research phase might end in infeasibility, and in order to keep the implementation team from starving for work, there should be some buffer between the two phases. Large buffers, however, are not recommended, because they detach the research effort from the customers’ needs.
3. Build heterogeneous teams
Diversity and heterogeneity are, in many cases, a blessing. Unfortunately, many organizations see their researchers as individual contributors. Researchers are viewed as highly smart individuals, yet strange and sometimes hard to work with. Individual-contributor researchers get very little feedback from customers, very little feedback from developers and very little feedback from product owners. No wonder that individual contributors in many cases do lose their connection with reality. No less important, in the individual-contributor mode, customers, product owners and developers get very little feedback from the researchers, and are often tempted to think that research is some sort of magic. When this is the case, product comes up with unrealistic requests and expectations and fails to communicate them to the researchers, and developers fail to prepare the infrastructure for implementing the new features. Teams that conduct research should not be composed solely of researchers. The developers who will implement the feature under research and the product representative should be part of the team.
Moreover, cross-role knowledge sharing should be encouraged. Researchers are often highly curious, open-minded individuals who can contribute in many domains, and sharing business knowledge and development knowledge with them creates a win-win. Research work is often highly interesting, and sharing it with product and development is also a win-win, since it makes the work more interesting for everyone and helps in aligning expectations.
4. Continuous improvement (retrospective)
The retrospective is an important part of agile. Beyond rituals and daily operations, agile is a philosophy, and there should be some amount of freedom in the implementation of that philosophy. When research-intensive domains are involved, the retrospective becomes even more crucial, due to the higher level of diversity and complexity. Going agile in research-intensive environments takes far more time and more improvement iterations, and leaving too little room for retrospectives jeopardizes the team’s satisfaction with the change, the probability of finding an implementation method that works well for the team, and eventually the overall success of the change.
5. It’s all about the business
Unlike basic research, in which the sole objective is generating new knowledge and where questioning the value of that new knowledge is not considered legitimate, research in a business setting should be applied and should serve the business objectives. Once written down, it is hard to argue with the claim that every piece of applied research should stem from a business need and serve that need. Nonetheless, so much effort, in so many organizations, is spent on research for no clear reason. This phenomenon especially characterizes emerging and sexy domains (like big data or data science in recent years). Conducting research with no clear business need is not only an almost certain recipe for failure, it is also a stick in the wheels of agile. With no clear business need, there is no clear way to define the outcome of an iteration, no way to commit to anything and no way to proceed. The most that can come out of such non-applied research is a sudden enlightenment. I know very few organizations that can afford to rely on the expectation of arbitrary enlightenment as a business model.
Shahar has over 15 years of experience leading complex analytics products based on intensive research, in many organizations, both startups and enterprises. As a machine learning expert, Shahar is very familiar with the needs and difficulties that come with research-oriented products, but he is also a true agile believer and a business leader who always looks for capabilities that drive the business forward fast.