Big Data: Principles and best practices of scalable realtime data systems by Nathan Marz
Services like social networks, web analytics, and intelligent e-commerce often need to manage data at a scale too big for a traditional database. As scale and demand increase, so does complexity. Fortunately, scalability and simplicity are not mutually exclusive; rather than reaching for some trendy technology, what is needed is a different approach. Big data systems use many machines working in parallel to store and process data, which introduces fundamental challenges unfamiliar to most developers.

Big Data shows how to build these systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to use them in practice, and how to deploy and operate them once they're built.

AUDIENCE
This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.

ABOUT THE TECHNOLOGY
To tackle the challenges of Big Data, a new breed of technologies has emerged, many of which have been grouped under the term NoSQL. In some ways these new technologies can be more complex than traditional databases, and in other ways simpler. Using them effectively requires a fundamentally new set of techniques.