Awesome Chefs – Infochimps Iron(fan)s Enterprise Big Data Stacks w/Chef

Infochimps helps some of the world’s biggest companies make sense of and then act on their troves of data. The Infochimps Cloud delivers Big Data systems with unprecedented speed, simplicity, scale, and flexibility to enterprise companies.

What’s that mean for their customers?

Their cloud-based Platform-as-a-Service gives enterprises access to real-time, historical, and correlated data sets, enabling in-depth analysis that improves business strategy. And thanks to their recent acquisition by CSC, they provide the cutting-edge technology of a startup backed by CSC’s $14B global revenue stream and five decades of enterprise expertise.

In other words, Infochimps is a company of super smart people helping their customers get way more out of business info than ever before. They do this using Hadoop (for batch analytics), Storm+Trident (for real-time streaming), and HBase and Elasticsearch (for scalable queries) — plus about four dozen supporting systems behind the scenes — all scaled across hundreds of machines. Oh, and that’s for each client. It’s fundamentally impractical to run infrastructure at this scale without an automation framework.

And, of course, thanks to Enterprise Chef and Infochimps’ own orchestration tool Ironfan (which they’ve been awesome enough to open source), they can do it all in style. Here’s what Flip Kromer, head of Technology & Architecture at Infochimps, had to say:

“Delivering a managed service as data-intensive as ours makes automation critical, since we have to get up and running quickly regardless of customer environment or specifications. Enterprise Chef’s declarative model lets us make a high-level blueprint of any infrastructure and easily custom-fit our Big Data platform to our clients’ needs – all with a super-reliable code base that’s versioned and repeatable.”

Infochimps’ clients are large-scale enterprises like Cisco and Western Digital/HGST. In that world, there’s no such thing as a standard deployment: enterprise companies don’t want technology, they want a business solution. Infochimps and the client scope the scale, specs, requirements, and technical details of the deployment. Chef and Ironfan let the Infochimps Cloud meet the customer where they live: in a physical data center, a virtualized one, a private cloud, or a public cloud. In fact, the same high-level Ironfan description can deploy a customer onto the AWS public cloud for initial proof-of-concept, then into their data center for production.

How is that accomplished? Ironfan has four interlocking pieces, all of them open-source:

  1. Industrial-strength Chef Cookbooks describing each of the Big Data and supporting components in their PaaS
  2. A Knife API plug-in coordinating the Chef server and cloud provider. Ironfan defines each machine’s roles and commissions the machine into existence; Chef provides the code that actually builds out its configuration.
  3. The “Silverware” utilities cookbook, enabling each machine to announce the capabilities it provides and discover the capabilities it requires. For example, a web server machine will dynamically recognize the database machines it should query and the load balancer that sends it requests. Make any localized change, and the system adapts to the new reality — critical for an infrastructure with hundreds of machines in motion.
  4. The Ironcuke test framework turns each machine’s “announcements” into a testable contract. Remember, the only way that an interacting system could discover and use, say, a database is because that database announced its address, port and other particulars. Ironcuke holds it to that promise: there is a server at that address and port, it responds promptly, and is correct in all those particulars. (Ironcuke is a recent addition, and though most people don’t rhapsodize about their test frameworks you can hear the engineers’ excitement at its potential.)
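
To make the announce/discover idea and the “testable contract” idea from items 3 and 4 concrete, here is a minimal Python sketch. It is not Silverware’s or Ironcuke’s actual API (Ironfan itself is built on Chef and Knife); every class, function, node name, and address below is hypothetical, and an in-memory registry stands in for whatever shared state the real tools consult.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Announcement:
    """What one machine promises to provide (field names are hypothetical)."""
    node: str
    capability: str
    host: str
    port: int


class Registry:
    """In-memory stand-in for the shared place where announcements live."""

    def __init__(self) -> None:
        self._by_capability: Dict[str, List[Announcement]] = {}

    def announce(self, ann: Announcement) -> None:
        # A machine publishes a capability it provides.
        self._by_capability.setdefault(ann.capability, []).append(ann)

    def discover(self, capability: str) -> List[Announcement]:
        # A machine looks up providers of a capability it requires.
        return list(self._by_capability.get(capability, []))

    def announcements(self) -> List[Announcement]:
        return [a for anns in self._by_capability.values() for a in anns]


def check_contracts(
    registry: Registry, probe: Callable[[str, int], bool]
) -> List[Announcement]:
    """Return every announcement whose promise the probe could not confirm."""
    return [a for a in registry.announcements() if not probe(a.host, a.port)]


registry = Registry()
registry.announce(Announcement("db-0", "database", "10.0.0.5", 5432))
registry.announce(Announcement("db-1", "database", "10.0.0.6", 5432))
registry.announce(Announcement("lb-0", "load_balancer", "10.0.0.2", 80))

# A web server finds its databases by capability, not by hard-coded address.
db_endpoints = [(a.host, a.port) for a in registry.discover("database")]


def fake_probe(host: str, port: int) -> bool:
    # Assumption for the sketch: pretend only db-1 fails to respond.
    return (host, port) != ("10.0.0.6", 5432)


broken = check_contracts(registry, fake_probe)
print("databases:", db_endpoints)
print("broken promises:", [a.node for a in broken])
```

In a real deployment the probe would open a connection to the announced address and port and verify the response; the fake probe here only keeps the sketch self-contained.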

So to spin up a new platform deployment, an Infochimps engineer customizes the standard architecture description, runs a series of tasks using Chef’s Knife tool, and validates that the platform is performing as expected. Once all the lights are green, the client’s new Big Data Compute Stack is live and ready to rock.

Now, there is obviously more to the process than this. But not much more. That’s the beauty of Chef and Ironfan – replacing manual and often complex processes with reliable, well-tested code makes things simpler, clearer, and more scalable.

Our friends at Infochimps are not only blazing the way in Big Data, they’re helping set the precedent for the future of IT – automated, customized, and code-driven.

“Chef and Ironfan let us leave behind brittle, manual procedures to a predictable workflow. Now, we can go from zero to production in a few hours,” added Erik Macdanz, Infochimps’ lead DevOps Engineer. “We can get clients up and running crazy fast, and have more cycles to innovate ahead of the curve.”

Read the press release on Infochimps’ use of Enterprise Chef below:

Opscode Helps Enable Infochimps’ “Cloud for Big Data”

Deploys Opscode Enterprise Chef™ to Automate Big Data Cloud Services, Delivering a Fully Managed Big Data Stack in Less Than Six Hours

SEATTLE – November 25, 2013 – Opscode®, the leader in IT automation, today announced that Infochimps, a CSC big data business, and provider of cloud-based Big Data services for the enterprise, has automated its Infochimps Cloud for Big Data with Opscode Enterprise Chef™. Using Enterprise Chef, Infochimps has automated configuration management and application delivery for its Big Data platform, enabling the information analytics innovator to go from zero to full production for its customers in less than six hours.

Infochimps Cloud for Big Data combines real-time, query-based, and Hadoop and batch analytics to solve a wide range of enterprise data challenges. The company’s fully managed Big Data solution eliminates the burden of infrastructure management and integrates with any data center, private cloud, or public cloud. To build, manage, and deliver its comprehensive data solutions, Infochimps deployed the hosted version of Enterprise Chef to ensure maximum speed and flexibility in meeting customer needs.

“Delivering a managed service as data-intensive as ours makes automation critical, since we have to get up and running quickly regardless of customer environment or specifications,” said Flip Kromer, head of Technology & Architecture, Infochimps. “Enterprise Chef’s declarative model lets us make a high-level blueprint of any infrastructure and easily custom-fit our Big Data platform to our clients’ needs – all with a super-reliable code base that’s versioned and repeatable.”

“Infochimps’ approach to Big Data makes a ton of sense – use Chef and the cloud to eliminate the infrastructure burden for clients and then give them unlimited flexibility to analyze and act on their business information,” said Adam Jacob, Chief Dev Officer, Opscode. “I also love that Infochimps are big Chef Community contributors, beginning with Ironfan and stretching into Cookbooks and much more.”

Integrating Enterprise Chef and Ironfan, Infochimps’ open source provisioning, deployment, and updating tool, the company has automated the entire compute stack for its cloud-based Big Data platform. With Chef and Ironfan, Infochimps can install its services in any client’s IT environment, whether physical servers, virtualized resources, or a private or public cloud. Using Chef Cookbooks to automate resource configuration and Ironfan for resource deployment, Infochimps can deliver fully featured Big Data stacks in just hours, enabling the company’s customers to access actionable data intelligence in no time flat.

  • Jesse Hu

    Infochimps’ Ironfan is an awesome tool for provisioning and configuring a Hadoop/HBase (or other kind of) cluster in EC2. VMware’s Project Serengeti (http://projectserengeti.org/) enhances and extends Ironfan to make it work perfectly in a vSphere vCenter virtualization environment.
