Data Science

In stock
All Indian Reprints of O'Reilly are printed in Grayscale. More organizations than ever understand the importance of data lake architectures for deriving value from their data. Building a robust, scalable, and performant data lake remains a complex proposition, however, with a buffet of tools and options that need to work together to provide a seamless end-to-end pipeline from data to insights. This book provides a concise yet comprehensive overview of the setup, management, and governance of a cloud data lake. Author Rukmani Gopalan, a product management leader and data enthusiast, guides data architects and engineers through the major aspects of working with a cloud data lake, from design considerations and best practices to data format optimizations, performance optimization, cost management, and governance.
• Learn the benefits of a cloud-based big data strategy for your organization
• Get guidance and best practices for designing performant and scalable data lakes
• Examine architecture and design choices, and data governance principles and strategies
• Build a data strategy that scales as your organizational and business needs increase
• Implement a scalable data lake in the cloud
• Use cloud-based advanced analytics to gain more value from your data
Author: Rukmani Gopalan; Binding: Paperback
How do you turn raw, unprocessed, or malformed data into dynamic, interactive web visualizations? In this practical book, author Kyran Dale shows data scientists and analysts, as well as Python and JavaScript developers, how to create the ideal toolchain for the job. By providing engaging examples and stressing hard-earned best practices, this guide teaches you how to leverage the power of best-of-breed Python and JavaScript libraries. Python provides accessible, powerful, and mature libraries for scraping, cleaning, and processing data. And while JavaScript is the best language when it comes to programming web visualizations, its data processing abilities can't compare with Python's. Together, these two languages are a perfect complement for creating a modern web-visualization toolchain. This book gets you started. You'll learn how to:
• Obtain data you need programmatically, using scraping tools or web APIs: Requests, Scrapy, Beautiful Soup
• Clean and process data using Python's heavyweight data processing libraries within the NumPy ecosystem: Jupyter notebooks with pandas+Matplotlib+Seaborn
• Deliver the data to a browser with static files or by using Flask, the lightweight Python server, and a RESTful API
• Pick up enough web development skills (HTML, CSS, JS) to get your visualized data on the web
• Use the data you've mined and refined to create web charts and visualizations with Plotly, D3, Leaflet, and other libraries
Author: Kyran Dale; Binding: Paperback
In stock
With technological advancements, fast markets, and higher complexity of systems, software engineers tend to skip the uncomfortable topic of software efficiency. However, tactical, observability-driven performance optimizations are vital for every product to save money and ensure business success. With this book, any engineer can learn how to approach software efficiency effectively, professionally, and without stress. Author Bartłomiej Płotka provides the tools and knowledge required to make your systems faster and less resource-hungry. Efficient Go guides you in achieving better day-to-day efficiency using Go. In addition, most content is language-agnostic, allowing you to bring small but effective habits to your programming or product management cycles. This book shows you how to:
• Clarify and negotiate efficiency goals
• Optimize efficiency on various levels
• Use common resources like CPU and memory effectively
• Assess efficiency using observability signals like metrics, logging, tracing, and (continuous) profiling via open source projects like Prometheus, Jaeger, and Parca
• Apply tools like go test, pprof, benchstat, and k6 to create reliable micro and macro benchmarks
• Efficiently use Go and its features like slices, generics, goroutines, allocation semantics, garbage collection, and more!
Author: Bartłomiej Płotka; Binding: Paperback
Data quality will either make you or break you in the financial services industry. Missing prices, wrong market values, trading violations, client performance restatements, and incorrect regulatory filings can all lead to harsh penalties, lost clients, and financial disaster. This practical guide provides data analysts, data scientists, and data practitioners in financial services firms with the framework to apply manufacturing principles to financial data management, understand data dimensions, and engineer precise data quality tolerances at the datum level and integrate them into your data processing pipelines. You'll get invaluable advice on how to:
• Evaluate data dimensions and how they apply to different data types and use cases
• Determine data quality tolerances for your data quality specification
• Choose the points along the data processing pipeline where data quality should be assessed and measured
• Apply tailored data governance frameworks within a business or technical function or across an organization
• Precisely align data with applications and data processing pipelines
• And more
Author: Brian Buzzelli; Binding: Paperback
Microsoft Power BI is a data analytics and visualization tool powerful enough for the most demanding data scientists, but accessible enough for everyday use for anyone who needs to get more from data. The market has many books designed to train and equip professional data analysts to use Power BI, but few of them make this tool accessible to anyone who wants to get up to speed on their own. This streamlined intro to Power BI covers all the foundational aspects and features you need to go from "zero to hero" with data and visualizations. Whether you work with large, complex datasets or work in Microsoft Excel, author Jeremey Arnold shows you how to teach yourself Power BI and use it confidently as a regular data analysis and reporting tool. You'll learn how to:
• Import, manipulate, visualize, and investigate data in Power BI
• Approach solutions for both self-service and enterprise BI
• Use Power BI in your organization's business intelligence strategy
• Produce effective reports and dashboards
• Create environments for sharing reports and managing data access with your team
• Determine the right solution for using Power BI offerings based on size, security, and computational needs
Author: Jeremey Arnold; Binding: Paperback
As enterprise-scale data science sharpens its focus on data-driven decision making and machine learning, new tools have emerged to help facilitate these processes. This practical ebook shows data scientists and enterprise developers how the notebook interface, Apache Spark, and other collaboration tools are particularly well suited to bridge the communication gap between their teams.
Author: Jerome Nilmeier, PhD; Binding: Paperback
If you want to work in any computational or technical field, you need to understand linear algebra. As the study of matrices and operations acting upon them, linear algebra is the mathematical basis of nearly all algorithms and analyses implemented in computers. But the way it's presented in decades-old textbooks is much different from how professionals use linear algebra today to solve real-world modern applications. This practical guide from Mike X Cohen teaches the core concepts of linear algebra as implemented in Python, including how they're used in data science, machine learning, deep learning, computational simulations, and biomedical data processing applications. Armed with knowledge from this book, you'll be able to understand, implement, and adapt myriad modern analysis methods and algorithms. Ideal for practitioners and students using computer technology and algorithms, this book introduces you to:
• The interpretations and applications of vectors and matrices
• Matrix arithmetic (various multiplications and transformations)
• Independence, rank, and inverses
• Important decompositions used in applied linear algebra (including LU and QR)
• Eigendecomposition and singular value decomposition
• Applications including least-squares model fitting and principal components analysis
Author: Mike X Cohen; Binding: Paperback
Do your product dashboards look funky? Are your quarterly reports stale? Is the data set you're using broken or just plain wrong? These problems affect almost every team, yet they're usually addressed on an ad hoc basis and in a reactive manner. If you answered yes to these questions, this book is for you. Many data engineering teams today face the "good pipelines, bad data" problem. It doesn't matter how advanced your data infrastructure is if the data you're piping is bad. In this book, Barr Moses, Lior Gavish, and Molly Vorwerck, from the data observability company Monte Carlo, explain how to tackle data quality and trust at scale by leveraging best practices and technologies used by some of the world's most innovative companies.
• Build more trustworthy and reliable data pipelines
• Write scripts to make data checks and identify broken pipelines with data observability
• Learn how to set and maintain data SLAs, SLIs, and SLOs
• Develop and lead data quality initiatives at your company
• Learn how to treat data services and systems with the diligence of production software
• Automate data lineage graphs across your data ecosystem
• Build anomaly detectors for your critical data assets
Author: Barr Moses; Author 2: Lior Gavish
Healthcare is the next frontier for data science. Using the latest in machine learning, deep learning, and natural language processing, you'll be able to solve healthcare's most pressing problems: reducing cost of care, ensuring patients get the best treatment, and increasing accessibility for the underserved. But first, you have to learn how to access and make sense of all that data.
Author: Andrew Nguyen; Binding: Paperback
Snowflake's ability to eliminate data silos and run workloads from a single platform creates opportunities to democratize data analytics, allowing users at all levels within an organization to make data-driven decisions. Whether you're an IT professional working in data warehousing or data science, a business analyst or technical manager, or an aspiring data professional wanting to get more hands-on experience with the Snowflake platform, this book is for you.
Author: Joyce Kay Avila; Binding: Paperback
For MySQL, the price of popularity comes with a flood of questions from users on how to solve specific data-related issues. That's where this cookbook comes in. When you need quick solutions or techniques, this handy resource provides scores of short, focused pieces of code, hundreds of worked-out examples, and clear, concise explanations for programmers who don't have the time (or expertise) to resolve MySQL problems from scratch.
Author: Sveta Smirnova; Author 2: Alkin Tezuysal
Data engineering has grown rapidly in the past decade, leaving many software engineers, data scientists, and analysts looking for a comprehensive view of this practice. With this practical book, you'll learn how to plan and build systems to serve the needs of your organization and customers by evaluating the best technologies available through the framework of the data engineering lifecycle.
Author: Joe Reis; Author 2: Matt Housley