Thursday, April 30, 2015

Guiding your Product Development through Voice of the Customer

“Great companies are built on great products.”    
- Elon Musk

Elon Musk was spot on about the importance of product development. Many people, though, mistake a great product for an aesthetically pleasing one. The beauty of a great product lies not just in its aesthetics but in how elegantly it works. It needs the right vision, a clear purpose and revolutionary changes. As Steve Jobs once said,

"Design is the fundamental soul of a man-made creation that ends up expressing itself in successive outer layers of the product or service. The iMac is not just the color or translucence or the shape of the shell. The essence of the iMac is to be the finest possible consumer computer in which each element plays together."

As important as revolutionary ideas are in making a product a breakthrough invention, it is successive iterative improvement that makes the product more usable and finely tuned, with each element playing together. Even if the product is first to market, it is iterative improvement that helps an organisation surge ahead of its competition. To improve, one needs a feedback loop from the product's users to its designers and developers. This is true both while the product is under development and once it is out in the market. While the quality assurance team and beta users help the product team during development, it is the real users who help develop the product in the later stage. The objective of this article is to discuss the second part of this process.

Brief History

Iterative product development is not a new idea. Almost everything is naturally built and developed over multiple iterations. Each iteration eliminates the bad and incorporates what good it can into the thing concerned. It is evolutionary in nature. Successive iterations over a period of time can change a product completely from its inception. Compare the first digital computer ever developed with a smartphone of today! Although both are indeed digital computers, their purposes and uses are completely different. These improvements, or rather changes, are made by considering use cases, which are called functions, and the value derived from them. Thus, I can arguably say that structured product development began during World War II and goes by the name ‘Value Engineering’. Quality function deployment is another notable methodology which tries to make use of users’ feedback.

Changing Technology Landscape

Every new disruptive technology radically changes how we do business. In the pre-Internet era, feedback was captured through customer surveys. These surveys, collected over a period of time, were analyzed, and the feedback was incorporated into a new, improved version of the product. This process used to span months and required extensive human resources. The very same survey can now be conducted online with far fewer resources. Each advancement in computing technology, and its adoption in business processes, makes the process increasingly manageable. Meanwhile, the widespread adoption of the Internet and new ways of communicating are also changing the behaviour of customers. Now the agony of a customer is not written down and mailed through the post; rather, it may be tweeted or posted on the wall of a Facebook page. The once-quiet customer is becoming more vocal. Their problems are not only sent to the organisation but also broadcast to millions of other customers. This may adversely affect any business if not handled properly. Today, millions of customer support calls are transcribed every day. The information explosion is real. But to our relief, along with the information explosion, we are also better equipped to handle such amounts of information and harness the power of data.


Like every automation technology, advances in Natural Language Processing and the ability to handle large amounts of data using distributed computing reduce the resources spent in processing customer feedback and increase the effectiveness of such systems. This will eventually shorten the time to market, or the product development iteration time, and help organisations outdo their competitors. We are now able to process customer feedback written in free-flowing text, extract the relevant information, and structure and organise it according to a specification. Later, this extracted information is converted into functions and prioritised by considering how many people are facing each issue and various other ranking parameters. One can even go to the extent of quantifying how emotionally connected customers are to issues and prioritise accordingly. One more important aspect of the customer's voice is finding the "unknown unknowns". It is now easier to mine customer suggestions and incorporate new functions that were never considered by the core development team. This helps any organisation innovate with customer acceptance validation.

Although these technologies help us crunch large amounts of data, extract meaning and organise it, the automation is of no use if it is not put to work in the product development process (PDP). Every organisation has its own processes, even though they are similar in nature. These processes need to be adapted to use the Voice of the Customer effectively.

Let us go through an example of how to use the voice of customers to improve a product. Suppose you are a smartphone manufacturer and you have a flagship phone that has been well received by top critics and most users. Your sales figures are good. But you would like to listen carefully to what users say about the phone on social media and review sites to identify the topmost problems you might want to fix. To get there, you list all the credible review sites where your phone is reviewed. Then grab all the reviews, split them into sentences, score each sentence for sentiment, and extract the key phrases, which are features of your phone, that appear in negative sentences. Once you have the features, rank them by how many people are facing the problem and how important the issue is. The next course of action is to prioritize customer care to handle such issues better and to figure out what can be done to fix them. In essence, ranking is a major tool for optimization.
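The steps in this example can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the word list NEGATIVE_WORDS, the feature list FEATURES and the sample reviews are all hypothetical, and a real system would use a trained sentiment model and a proper key-phrase extractor instead of simple word matching. The ranking here uses only "how many reviews report the problem"; an importance weight could be added in the same place.

```python
import re
from collections import Counter

# Hypothetical toy lexicon and feature list, for illustration only.
NEGATIVE_WORDS = {"bad", "poor", "slow", "terrible", "drains", "cracked"}
FEATURES = {"battery", "camera", "screen", "speaker"}


def split_sentences(review):
    """Split a review into sentences on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def words_of(sentence):
    """Return the set of lowercase alphabetic words in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def is_negative(sentence):
    """Treat a sentence as negative if it contains any word from the toy lexicon."""
    return bool(words_of(sentence) & NEGATIVE_WORDS)


def rank_problem_features(reviews):
    """Count, per feature, how many reviews mention it in a negative sentence."""
    counts = Counter()
    for review in reviews:
        seen = set()  # count each feature at most once per review
        for sentence in split_sentences(review):
            if is_negative(sentence):
                seen |= words_of(sentence) & FEATURES
        counts.update(seen)
    return counts.most_common()


reviews = [
    "Great screen. The battery drains too fast!",
    "Battery life is poor. Camera is excellent.",
    "The camera is slow in low light.",
]
ranking = rank_problem_features(reviews)
print(ranking)
```

With these three made-up reviews, the battery is flagged by two reviewers and the camera by one, so the battery would be the first problem to look at.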

What’s Next?

Many of the core technologies used in processing customer feedback, such as ‘sentiment analysis’, ‘feature extraction’ and ‘ranking’, can be reused in the much larger Customer Experience Management and Brand Management domains. This feedback can also be quantified to aid the strategic decision-making of any organisation. A complete, elegantly integrated system will certainly help any organisation serve its customers' needs and adapt to a changing landscape.

I will be primarily writing about Customer Experience Management (CEM), Brand Management and Voice of the Customer in this space. We at Datoin are developing those very core technologies that can assist in building a complete suite of applications which solve the aforementioned problems. See you soon with something new to talk about. Stay tuned!

A data pipeline for the Internet era

Hold your breath, count three, two, one and release. Congratulations! You just spent three seconds of your life. Guess what happened on the Internet in the meantime?
  • How many new tweets were posted on Twitter?
  • How many posts were published on Facebook and Google+?
  • How many new videos were uploaded to YouTube?
  • How many new images were uploaded to Instagram?
  • How many articles were posted on WordPress and Blogger?
  • How many questions were asked on Quora and StackExchange sites, and how many answers were given to those questions?

While the above questions may seem rhetorical, trying to find approximate numbers gives a sense of how fast the online world moves. New users are constantly being added to the Empire of the Internet as we move towards connecting every person and thing on this planet. Entrepreneurs have given us the freedom to express and share our opinions online. New varieties of devices are invented from time to time, always ready to consume our content. I don't need to move myself near a big fat machine, wait minutes for that thing to turn on and connect to the Internet to write and share my opinions; the smartphones and tablets of today are always on, always connected, eagerly waiting, hungry to consume what I generate.

Consequently, the volume of information is proliferating! The other part of the story: how do we deal with such a huge volume of information? As the old proverb says, "Necessity is the mother of invention"; the necessity of dealing with huge volumes of information led to the creation of amazing frameworks. Thanks go to the engineers who made up their minds to decipher the hidden hints and bring us solutions. Not every problem is solved, at least not yet, but some good souls have contributed their creations by open sourcing them, so anyone can explore and improve on them. As a result, a lot has changed in the past decade (2005 - 2015). If you had been stuck on a text analysis operation such as clustering a few million documents a decade ago, it would have been a difficult situation. It's a different game altogether today, as your toolbox is equipped with a plethora of capable tools. If I have to mention one and just one tool suite, I opt for Apache Hadoop.

Though we are aware of the story of Hadoop emerging from the platform for running a distributed crawler (Apache Nutch), the way it has progressed in the past years is astonishing. It has evolved to a state where it can manage thousands of nodes and deal with petabytes of data, regardless of what application you run on top and how you run it. Yes, it is the de facto big data operating system. It has evolved into a prominent ecosystem. Apart from the default filesystem, HDFS, we see a variety of data persistence solutions, each crafted to provide a missing functionality or supersede its precursors. Whether we need to store content in a sequentially accessible file for processing the whole file in a batch (HDFS SequenceFile), a data store for random, real-time read/write access (HBase), or a SQL-like warehouse (Hive), depending on the application's requirements, we have got one!

Just as storage services have grown richer in features, the computing part has moved on too. It is no longer limited to the plain assembly instructions of distributed computing, simple map-sort-partition-reduce; we now have high-level statements built from these assembly instructions! We have Pig to script tasks. We have Oozie workflows to connect stubbornly independent steps. I was amazed when I tried to rebuild an Oozie workflow using Apache Tez's DAG at runtime. It's monsoon season (party time) for data scientists!
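To give a feel for what those "assembly instructions" look like, here is a minimal single-machine sketch of map, partition, sort and reduce applied to word counting. It is plain Python and does not use Hadoop's API at all; the function names and the two sample documents are my own assumptions, and in a real cluster each bucket would be shipped to a separate reducer node instead of being processed in a loop.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in the document."""
    return [(word.lower(), 1) for word in document.split()]


def partition(pairs, num_reducers):
    """Partition: route each pair to a reducer bucket by hashing its key."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[hash(key) % num_reducers].append((key, value))
    return buckets


def reduce_phase(bucket):
    """Sort the bucket by key, then reduce by summing the values of each key."""
    totals = defaultdict(int)
    for key, value in sorted(bucket):
        totals[key] += value
    return dict(totals)


def word_count(documents, num_reducers=2):
    """Run map over all documents, then partition, sort and reduce."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    result = {}
    for bucket in partition(pairs, num_reducers):
        result.update(reduce_phase(bucket))
    return result


counts = word_count(["big data big elephant", "data pipeline"])
print(counts)
```

Because every occurrence of a word hashes to the same bucket, each reducer sees all the values for its keys, which is the property that tools such as Pig build on when they compile high-level statements down to these steps.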

Let us see how smart people are riding the wave by harnessing the power of the big yellow elephant to analyze the flood of data. A long list of organizations has put Hadoop to work. Some of them have contributed back by fixing issues, adding and perfecting features, and developing better tools. As a result, a lot has been shared across organizations through active participation. This give-and-take is not just between organizations; computing disciplines are bartering in their own way too. For instance, Machine Learning and Natural Language Processing complement Big Data, and vice versa.

The currency is not the data itself but the information hidden inside it! What's the use of data if we do not have the luxury of analytics? How effective is analytics if we do not have visualizations to grasp it in a minute or less? As the majority of Internet content is natural text penned by humans, we definitely need natural language processing and machine learning to get insights. On the other hand, some of the complex natural language processing problems which demanded enormous data for machine learning solutions are now solved more accurately, as we have more data to feed the learning algorithms. As people say, it is the best time.

If you are stuck with the document clustering problem that I mentioned earlier, clustering a twelve-plus-digit number of documents is no longer a difficult situation. You would probably play with the algorithms of Apache Mahout or Apache Spark and run k-means. What if your analysis requires a sequence of tasks such as web crawling, extraction, sentiment analysis and visualization? You are going to form a data pipeline to carry out all the steps in sequence.
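For readers who have not met k-means before, here is a tiny single-machine sketch of the idea. It is not Mahout's or Spark's implementation; the function names are mine, the points are made-up two-dimensional vectors standing in for document vectors, and seeding the centroids with the first k points is a simplifying assumption to keep the run deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Plain k-means: alternate assignment and centroid update steps."""
    # Assumption: seed centroids with the first k points (deterministic, not robust).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical 2-D "document vectors" forming two obvious groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

The distributed versions follow the same two steps; the difference is that the assignment step is spread over the cluster and the centroid updates are combined from partial sums.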

In essence, pipeline processing has the potential to tackle complicated tasks at Internet scale. We have built a pipeline processing platform to assemble components (pieces of software which each solve a simple task) into useful applications and run these applications on a cluster of nodes. For instance, if you wish to mine what users on the Internet are saying about specific brands, you can:
  • Gather data from the World Wide Web: just grab a crawler component and configure the sources.
  • Grab an extractor and connect it to the crawler. Specify all the fields which you wish to extract.
  • Grab a Sentiment Analysis module and connect it to the extractor.
  • Optionally, grab and connect an aggregator module to aggregate the sentiments.
There you go, a pipeline will be ready to process data from websites. Visit Datoin and build your first pipeline application.
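To make the idea of chaining components concrete, the four steps above can be mimicked with plain Python generators. This is a hypothetical sketch and not Datoin's actual API: the component names, the in-memory "web" of example.com pages and the tiny sentiment word lists are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical word lists standing in for a real sentiment model.
POSITIVE = {"love", "great"}
NEGATIVE = {"hate", "awful"}


def crawler(sources):
    """Stand-in crawler: yields pages from an in-memory dict instead of the web."""
    for url, html in sources.items():
        yield {"url": url, "html": html}


def extractor(pages):
    """Extract the text field by stripping tags and collapsing whitespace."""
    for page in pages:
        text = re.sub(r"<[^>]+>", " ", page["html"])
        yield {"url": page["url"], "text": " ".join(text.split())}


def sentiment(docs):
    """Label each document by counting positive and negative words."""
    for doc in docs:
        words = re.findall(r"[a-z]+", doc["text"].lower())
        score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        yield {**doc, "sentiment": label}


def aggregator(docs):
    """Aggregate the sentiment labels into counts."""
    return Counter(doc["sentiment"] for doc in docs)


def run_pipeline(source, *stages):
    """Connect the stages so that each one consumes the previous one's output."""
    data = source
    for stage in stages:
        data = stage(data)
    return data


sources = {
    "http://example.com/a": "<p>I love this brand, great phone!</p>",
    "http://example.com/b": "<p>Awful support. I hate waiting.</p>",
    "http://example.com/c": "<p>It arrived on Tuesday.</p>",
}
summary = run_pipeline(sources, crawler, extractor, sentiment, aggregator)
print(dict(summary))
```

Each stage knows nothing about the others beyond the shape of the records it receives, which is what lets components be swapped or rearranged without rewriting the whole application.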