Data Needs a Platform


Google or the Yellow Pages, Uber or Yellow Cab, Netflix or Comcast, Nest or Honeywell, Stitch Fix or Nordstrom, Etsy or Bernhardt Furniture: every industry, profession, startup, and enterprise is making a hard shift to digital at a rapid pace. As a by-product of this digital evolution, data exhaust is growing exponentially from every interaction across mobile, web, social, commerce, and the many new touch points of the digital landscape. According to Oxford Dictionaries, data is “facts and statistics collected together for reference or analysis.” It is important to note that data is different from information. It is an abstraction of information that lends itself to code and math, which in turn make data products. An ecosystem of data suppliers, producers, services, and consumers is emerging to support a dataFirst development practice.

Many factors are accelerating the transition from the offline world to the always-on generation: a cultural shift in connectedness; a technology shift from centralized to decentralized computing infrastructure; and an economic shift from cost-prohibitive resources to accessible cloud computing, memory, storage, and software, driven by the rise of Open Source Software (OSS) and indirect monetization business models. Making this transition is not easy, and it takes data literacy to stay competitive and take full advantage of the shift.

At IBM, we have recognized this shift by declaring cloud computing and cognitive solutions our strategic initiatives. At the center of this shift is data. Said another way, our clients are moving their businesses online and, in doing so, creating data exhaust that can be leveraged for machine learning to build data products such as customer service chatbots and teaching assistants. Unfortunately, there isn’t a data platform for working with exhaust data and transactional data to build data products for cognitive solutions.

We have built platforms for application development (Bluemix) and cognitive solutions (Watson), but we have yet to build a platform for data that connects the two through a robust ecosystem of data producer and consumer partners. Instead, as an industry, we have continued to drive product-centric data ecosystems that succeeded in the past but are now faltering as the data consumer transforms. For example, the NoSQL data ecosystem (Hadoop, Cassandra, MongoDB), the MPP data ecosystem (Vertica, Netezza, Greenplum), and the RDBMS data ecosystem (MySQL, PostgreSQL, DB2, Oracle, etc.) have all depended on a Business Intelligence consumption model to drive a business process. Dashboards are dead.

On September 26th, IBM will launch the first data platform that is built on open source software and cloud computing and that includes key Watson services to deliver cognitive solutions. Additionally, we’ll introduce dataFirst methods to help clients and partners bridge the gap between digital and cognitive solutions. We’ll introduce dataFirst certifications to extend the data platform by supporting a broad ecosystem of partners building on a single data platform that is open for all. We’ll bring together leading data programs across IBM, including consulting services, skills and training, independent software vendors, technology leaders, and many others who have an interest in data, into a seamless experience that maximizes interactions between data producers and consumers.

In the past year, we invested in open source technology, most notably Apache Spark as the Analytics OS, and introduced industry-leading user experiences for both data consumers and data producers. Watson Analytics makes analytics consumable, and the Data Science Experience makes data producible. Together, these two offerings represent two ends of the data and analytics spectrum. After the data platform launches, these two disparate experiences become connected through a fabric with open services to a growing ecosystem of suppliers for data ingestion, persistence, machine learning, orchestration, discovery, and access. In addition to the Data Science Experience and Watson Analytics, other producer and consumer endpoints will emerge to address every industry and profession. For example, in IoT, we’ll introduce experiences for device makers and application developers. Connecting data producers to data consumers gives rise to a data marketplace where dataFirst practitioners collaborate and learn from each other instead of remaining niche providers in disconnected ecosystems.

Over the next 30 days, read about data literacy, open source software, and the move from pipelines to platforms, and help shift our industry from the commoditization of product pipelines to platform growth that serves emerging markets. Join us at our dataFirst launch event: http://ibm.co/datafirstnow
