Taming the Big Data Wave

For decades data storage has been a control freak’s dream: Data are “structured” and consistent, dutifully tucked away in disk-based systems that are prodigious in capacity but cantankerous and inflexible, unable to expand or shrink as data flows dictate.

Now comes a data deluge that will make control freaks… freak. A new wave of “unstructured” data is upon us, like one monumental bad hair day, ornery, unmanageable, inconsistent and flighty. The market for handling this unstructured, messy mass could grow almost seven-fold to $20 billion in just the next three or four years, an IDC report says.

That’s a fat, juicy target for a platoon of early-stage firms presenting at the Montgomery Summit, sponsored by Macquarie Capital and set for this coming Tuesday and Wednesday, March 10-11, at the Fairmont Miramar hotel in Santa Monica, Calif. I’ll be there as MC.

These outfits boast names that are fittingly structural given their push to overhaul the core IT infrastructure of businesses: CoreOS, Bracket, Pernix and Nintex, among others. They ply new techniques in virtualization, generic hardware, cloud computing, flash memory and cross-platform software to let large enterprises become Jack-be-nimbles as their data streams swell with ever more unpredictable, spiky, hard-to-tame data.

For two decades, structured data dominated storage; it now is at $13 billion of a total $16 billion market, vs. $3 billion for “unstructured,” says Ramana Jonnala, CEO of Coho Data, another presenter at Monty. “But in the next three years, there will be an explosion in unstructured data,” he says. By 2018, that segment could hit $20 billion vs. $16 billion for structured.

The data onslaught emanates from myriad devices—smartphones, tablets, servers, Net sensors, apps, wi-fi hotspots and more. Structured data tells a sneaker maker how many pairs it sold in California last month; unstructured data, late-breaking and fast-changing, can reveal how many pairs might sell next month. Harnessing those insights will require new tools.

Coho Data sells an “appliance” to help manage the messy data and turn it into an advantage: $90,000 for a “whitebox” server with 50 terabytes of storage (flash and disk). Companies can keep their critical data close to home and form private clouds, rather than using, say, Amazon’s S3 “public” cloud at 3 cents a month per gigabyte. “Just the assumption that everything belongs in the public cloud is actually a lazy notion,” Jonnala avers.
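A rough way to weigh those two options, using only the figures above: 50 terabytes parked in S3 at 3 cents per gigabyte-month versus the $90,000 appliance sticker price. A minimal sketch (my own arithmetic, ignoring power, support staff, S3 request fees and data-transfer charges):

```python
# Back-of-the-envelope comparison of the appliance vs. public-cloud options
# named in the article. Figures come from the article; the break-even math
# is illustrative only.

appliance_cost_usd = 90_000          # Coho Data whitebox, 50 TB flash + disk
appliance_capacity_gb = 50 * 1_000   # 50 TB in (decimal) gigabytes

s3_price_per_gb_month = 0.03         # Amazon S3, per the article

# Monthly S3 bill for the same 50 TB of data
s3_monthly_usd = appliance_capacity_gb * s3_price_per_gb_month

# Months of S3 rental that equal the appliance's sticker price
months_to_break_even = appliance_cost_usd / s3_monthly_usd

print(f"S3 monthly cost for 50 TB: ${s3_monthly_usd:,.0f}")
print(f"Break-even vs. the appliance: {months_to_break_even:.0f} months")
```

On hardware cost alone, the box pays for itself in about five years of avoided S3 rent, which is roughly a hardware refresh cycle; the real argument, as Jonnala frames it, is control of where critical data lives.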

More than 40 corporate customers, from Wall Street banks to healthcare companies, have bought his DataStream boxes in eight months on the market. A big data platform at, say, J.P. Morgan Chase can entail three petabytes of storage (that’s 3,000 terabytes or 3 million gigabytes). “In our case that translates to a hundred boxes with our solution,” he says. “The beauty is the customer doesn’t need to buy all three petabytes on Day One. He can take 500 terabytes, one-sixth, and adding the next terabyte is dead simple.” It takes only an hour. “Traditional storage can’t do that.”
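The unit math behind that scaling story, spelled out in decimal units (matching the article's own 3,000-terabyte conversion; the one-sixth figure is the quoted 500-terabyte Day One purchase):

```python
# Unit conversions for the J.P. Morgan-scale example above, plus the
# "start at one-sixth" incremental buy. Decimal units: 1 PB = 1,000 TB.

platform_pb = 3
platform_tb = platform_pb * 1_000      # 3,000 TB
platform_gb = platform_tb * 1_000      # 3,000,000 GB

initial_buy_tb = 500                   # the customer's Day One purchase
initial_fraction = initial_buy_tb / platform_tb   # one-sixth of the platform

print(f"{platform_pb} PB = {platform_tb:,} TB = {platform_gb:,} GB")
print(f"Day One buy: {initial_buy_tb} TB ({initial_fraction:.0%})")
```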

While Coho relies on hardware, another Monty presenter, Pernix Data, bets on software. “It’s easy to go say ‘Take my shiny box, someone else has!’ but enterprises need to realize the power is not in that box—hardware they can get anywhere—it’s in the software,” says CEO Poojan Kumar.

Pernix uses crafty software to help storage play catch-up. Storage has fallen behind the advances in processing power, virtualization software, smartphones and apps. “Storage vendors, the EMC’s of the world, haven’t been able to keep up with the demands of the virtual infrastructure on the compute side,” he says. Pernix software enlists servers to handle critical and urgent storage tasks using flash memory right there inside the box, instead of using an EMC disk array nearby. “We are disrupting the traditional way of thinking, where I need to solve everything on the storage side. No!”

Flash, so aptly named, can process commands a hundred times as fast as disk storage: 200 I/O commands per second for disk vs. 20,000 for flash. Its ubiquitous use in smartphones and tablets has raised its scale and power so much that flash is now moving into the enterprise.
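Those quoted figures do work out to the hundred-fold claim:

```python
# The flash-vs-disk speedup quoted above, checked as plain arithmetic.
disk_iops = 200       # I/O commands per second, spinning disk (article figure)
flash_iops = 20_000   # I/O commands per second, flash (article figure)

speedup = flash_iops / disk_iops
print(f"Flash handles {speedup:.0f}x the I/O commands per second of disk")
```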

That owes largely to the “hyperscale” giants: Google, Amazon, Apple, Facebook, et al. They did the impossible, building out massive networks of cheap, homemade servers and jazzing them up with their own code and flash for superfast access. A typical corporate network links up 10,000 servers; the hyperscalers harness a million, each one 100 times the power of a mainframe of the 1990s. Unbelievable.

The overhaulers at Montgomery Summit want to bring hyperscale to businesses. “That is a big river of change in the enterprise,” says Rajesh Ghai, senior analyst at summit sponsor Macquarie Capital. “A lot of hyperscale concepts are going to make it to the enterprise, and that’s what’s driving investment.”

“We’re helping companies run infrastructure like the big guys,” says Alex Polvi, co-founder and CEO of CoreOS. Its Linux-based software, which debuted only last summer, orchestrates batches of servers, from as few as three up to hundreds of thousands, theoretically. One customer testing CoreOS was able to run 100,000 servers at once without any meltdown.

The trickle-down from Google and the hyperscale bunch to the rest of the business world “is changing everything,” says Tom Gillis, CEO of Bracket Computing, also a Monty presenter. “Guess what? Who’s the third-largest server manufacturer in the world? Google, for its own use. That’s an astounding statistic, right?”

Bracket provides a layer of software between enterprises and the cloud, bulletproofing security and managing data flows. “We are like an operating system that runs on top of the cloud,” he says, reluctantly, because “a bunch of people have made that claim, and it’s been bogus.” Bracket launched in October, and DirecTV, GE and Blackstone have been testing its software. Some clients are “rolling it out bigtime,” he says.

The mission at Nintex, another overhaul wizard at Montgomery Summit, is to help businesses tame the mass of unstructured content by automating the impossible. “We automate everyday workflows that haven’t been able to be automated in the past,” says CEO and co-founder John Burton.

Nintex, with more than $50 million in sales, provides the software platform that lets businesses create their own automated processes with surprising ease and without an army of code-crunchers. Users click labeled boxes on-screen, then drag and reposition them to cover related tasks.

In business since 2007, Nintex now helps automate the tortuous, paper-based, multi-approval processes at 5,000 customer sites with nine million users in 90 countries and 23 languages. Astoundingly, some customers have used the Nintex platform to design and automate 14,000 different paperwork processes at their companies, from new-hire “onboarding” to ad-budget authorization to travel approvals.

The average first-time sale for Nintex is $25,000 and up to $75,000 for partner resellers to install and implement it, which takes only two to four weeks. Each automated workflow shaves 66% off the cost of the paper-based method, and customers earn back their money in six to nine months.
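A quick sanity check of that payback claim, using the $25,000 average first-time sale and the six-to-nine-month window. The derived monthly figures are my own inference from the article's numbers, not Nintex's:

```python
# Implied monthly savings behind the payback claim above. The 66% figure and
# the 6-9 month window come from the article; the derived baseline spend on
# paper-based processing is illustrative inference only.

first_sale_usd = 25_000
payback_months = (6, 9)
savings_rate = 0.66    # cost shaved off each paper-based workflow

for months in payback_months:
    monthly_savings = first_sale_usd / months
    implied_paper_spend = monthly_savings / savings_rate
    print(f"{months}-month payback -> ${monthly_savings:,.0f}/mo saved "
          f"(~${implied_paper_spend:,.0f}/mo spent on the paper process)")
```

In other words, the claim holds for a customer whose paper-based processes were costing them a few thousand dollars a month before automation.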

I think it best to withhold comment here on whether any business rightly should even have 14,000 distinct workflow processes . . . that is a different problem.
