All Indian reprints of O'Reilly books are printed in grayscale.
With the explosion of data, computing power, and cloud data warehouses, SQL has become an even more indispensable tool for the savvy analyst or data scientist. This practical book reveals new and hidden ways to improve your SQL skills, solve problems, and make the most of SQL as part of your workflow.
Minimum quantity for "SQL for Data Analysis: Advanced Techniques for Transforming Data Into Insights" is 1.
This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 80 tools--useful whether you work with Windows, macOS, or Linux.
Minimum quantity for "Data Science at the Command Line: Obtain, Scrub, Explore, and Model Data with Unix Power Tools, 2nd Edition" is 1.
As you move data to the cloud, you need to consider a comprehensive approach to data governance, along with well-defined and agreed-upon policies to ensure your organization meets compliance requirements. Data governance incorporates the ways people, processes, and technology work together to ensure data is trustworthy and can be used effectively. This practical guide shows you how to effectively implement and scale data governance throughout your organization.
Chief information, data, and security officers and their teams will learn strategy and tooling to support democratizing data and unlocking its value while enforcing security, privacy, and other governance standards.
Minimum quantity for "Data Governance: The Definitive Guide - People, Processes, and Tools to Operationalize Data Trustworthiness" is 1.
While Excel remains ubiquitous in the business world, recent Microsoft feedback forums are full of requests to include Python as an Excel scripting language. In fact, it's the top feature requested. What makes this combination so compelling? In this hands-on guide, Felix Zumstein--creator of xlwings, a popular open source package for automating Excel with Python--shows experienced Excel users how to integrate these two worlds efficiently.
Minimum quantity for "Python for Excel: A Modern Environment for Automation and Data Analysis" is 1.
Data pipelines are the foundation for success in data analytics. Moving data from numerous diverse sources and transforming it to provide context is the difference between having data and actually gaining value from it. This pocket reference defines data pipelines and explains how they work in today's modern data stack.
You'll learn common considerations and key decision points when implementing pipelines, such as batch versus streaming data ingestion and build versus buy. This book addresses the most common decisions made by data professionals and discusses foundational concepts that apply to open source frameworks, commercial products, and homegrown solutions.
Minimum quantity for "Data Pipelines Pocket Reference: Moving and Processing Data for Analytics" is 1.
Working with unbounded and fast-moving data streams has historically been difficult. But with Kafka Streams and ksqlDB, building stream processing applications is easy and fun. This practical guide shows data engineers how to use these tools to build highly scalable stream processing applications for moving, enriching, and transforming large amounts of data in real time.
Mitch Seymour, data services engineer at Mailchimp, explains important stream processing concepts against a backdrop of several interesting business problems. You'll learn the strengths of both Kafka Streams and ksqlDB to help you choose the best tool for each unique stream processing project.
Minimum quantity for "Mastering Kafka Streams and ksqlDB: Building Real-Time Data Systems by Example" is 1.
You may know SQL basics, but are you taking advantage of its expressive power? This second edition applies a highly practical approach to Structured Query Language (SQL) so you can create and manipulate large stores of data. Based on real-world examples, this updated cookbook provides a framework to help you construct solutions and executable examples in several flavors of SQL, including Oracle, DB2, SQL Server, MySQL, and PostgreSQL.
Minimum quantity for "SQL Cookbook Second Edition: Query Solutions and Techniques for All SQL Users" is 1.
For self-service data preparation, Tableau Prep is relatively easy to use as long as you know how to clean and organize your datasets. Carl Allchin, from The Information Lab in London, gets you up to speed on Tableau Prep through a series of practical lessons that include methods for preparing, cleaning, automating, organizing, and outputting your datasets.
Minimum quantity for "Tableau Prep: Up & Running: Self-Service Data Preparation for Better Analysis" is 1.
How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner.
Minimum quantity for "Building an Anonymization Pipeline: Creating Safe Data" is 1.
Time series data analysis is increasingly important due to the massive production of such data through the internet of things, the digitalization of healthcare, and the rise of smart cities. As continuous monitoring and data collection become more common, the need for competent time series analysis with both statistical and machine learning techniques will increase.
Covering innovations in time series data analysis and use cases from the real world, this practical guide will help you solve the most common data engineering and analysis challenges in time series, using both traditional statistical and modern machine learning techniques.
Minimum quantity for "Practical Time Series Analysis: Prediction with Statistics and Machine Learning" is 1.
Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?
Minimum quantity for "Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems" is 1.