Python Machine Learning
Format: PDF / Kindle (mobi) / ePub
Unlock deeper insights into Machine Learning with this vital guide to cutting-edge predictive analytics
About This Book
- Leverage Python's most powerful open-source libraries for deep learning, data wrangling, and data visualization
- Learn effective strategies and best practices to improve and optimize machine learning systems and algorithms
- Ask – and answer – tough questions of your data with robust statistical models, built for a range of datasets
Who This Book Is For
If you want to find out how to use Python to start answering critical questions of your data, pick up Python Machine Learning – whether you want to get started from scratch or want to extend your data science knowledge, this is an essential and unmissable resource.
What You Will Learn
- Explore how to use different machine learning models to ask different questions of your data
- Learn how to build neural networks using Pylearn2 and Theano
- Find out how to write clean and elegant Python code that will optimize the strength of your algorithms
- Discover how to embed your machine learning model in a web application for increased accessibility
- Predict continuous target outcomes using regression analysis
- Uncover hidden patterns and structures in data with clustering
- Organize data using effective pre-processing techniques
- Get to grips with sentiment analysis to delve deeper into textual and social media data
Machine learning and predictive analytics are transforming the way businesses and other organizations operate. Understanding trends and patterns in complex data is critical to success and has become one of the key strategies for unlocking growth in a challenging marketplace. Python can help you deliver insights into your data: its capabilities as a language let you build sophisticated algorithms and statistical models that reveal new perspectives and answer the questions that matter most to your organization.
Python Machine Learning gives you access to the world of predictive analytics and demonstrates why Python is one of the world's leading data science languages. If you want to ask better questions of data, or need to improve and extend the capabilities of your machine learning systems, this practical data science book is invaluable. Covering a wide range of powerful Python libraries, including scikit-learn, Theano, and Pylearn2, and featuring guidance and tips on everything from sentiment analysis to neural networks, you'll soon be able to answer some of the most important questions facing you and your organization.
Style and approach
Python Machine Learning connects the fundamental theoretical principles behind machine learning to their practical application in a way that focuses you on asking and answering the right questions. It walks you through the key elements of Python and its powerful machine learning libraries, while demonstrating how to get to grips with a range of statistical models.
direction of the gradient. In order to find the optimal weights of the model, we optimized an objective function that we defined as the Sum of Squared Errors (SSE) cost function. Furthermore, we multiplied the gradient by a factor, the learning rate, which we chose carefully to balance the speed of learning against the risk of overshooting the global minimum of the cost function. In gradient descent optimization, we updated all weights simultaneously after each epoch, and we defined the
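The update rule described in this excerpt can be sketched in a few lines of plain Python. This is a minimal illustration rather than the book's own implementation: the function names, the toy data, the learning rate `eta=0.05`, and the epoch count are all assumptions chosen so that the example converges.

```python
def predict(weights, row):
    """Linear net input: the dot product of the weights and one sample."""
    return sum(w * x for w, x in zip(weights, row))


def sse_cost(weights, X, y):
    """SSE cost J(w) = 1/2 * sum_i (y_i - w . x_i)^2."""
    return 0.5 * sum((yi - predict(weights, row)) ** 2 for row, yi in zip(X, y))


def batch_gradient_descent(X, y, eta=0.05, n_epochs=500):
    """Update all weights simultaneously once per epoch (full-batch)."""
    weights = [0.0] * len(X[0])
    costs = []
    for _ in range(n_epochs):
        errors = [yi - predict(weights, row) for row, yi in zip(X, y)]
        # Gradient of J with respect to w_j is -sum_i error_i * x_ij.
        gradient = [-sum(e * row[j] for e, row in zip(errors, X))
                    for j in range(len(weights))]
        # Step in the opposite direction of the gradient, scaled by eta.
        weights = [w - eta * g for w, g in zip(weights, gradient)]
        costs.append(sse_cost(weights, X, y))
    return weights, costs


# Hypothetical toy data: y = 1 + 2x, with a leading 1 in each row as the bias input.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

weights, costs = batch_gradient_descent(X, y)
print("weights:", [round(w, 4) for w in weights])
print("first cost:", round(costs[0], 4), "last cost:", costs[-1])
```

With a learning rate that is too large for the data (for this toy set, roughly above 0.12), the same loop overshoots the minimum and the cost grows from epoch to epoch, which is the risk the excerpt refers to.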
at the following equation: The left side of the preceding equation can then be interpreted as the distance between the positive and negative hyperplane, which is the so-called margin that we want to maximize. Now the objective function of the SVM becomes the maximization of this margin by maximizing under the constraint that the samples are classified correctly, which can be written as follows: These two equations basically say that all negative samples should fall on one side of the
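The equations this excerpt refers to did not survive in the text above. In the standard hard-margin SVM formulation they are usually written as shown below. The notation (weight vector $w$, bias $w_0$, and points $x_{pos}$ and $x_{neg}$ lying on the positive and negative hyperplanes) is an assumption based on the common textbook form, not something recovered from this page.

```latex
% Margin: distance between the positive and negative hyperplanes
\frac{w^{T}\left(x_{pos} - x_{neg}\right)}{\lVert w \rVert} = \frac{2}{\lVert w \rVert}

% Maximize 2 / ||w|| subject to every sample being classified correctly
w_0 + w^{T} x^{(i)} \ge 1 \quad \text{if } y^{(i)} = 1
w_0 + w^{T} x^{(i)} \le -1 \quad \text{if } y^{(i)} = -1
```

The first line follows from subtracting the two hyperplane equations $w_0 + w^{T}x_{pos} = 1$ and $w_0 + w^{T}x_{neg} = -1$, which gives $w^{T}(x_{pos} - x_{neg}) = 2$, and then dividing both sides by $\lVert w \rVert$.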
half-moon shapes and the concentric circles, we projected a single dataset onto a new feature. In real applications, however, we may have more than one dataset that we want to transform, for example, training and test data, and typically also new samples we will collect after the model building and evaluation. In this section, you will learn how to project data points that were not part of the training dataset. As we remember from the standard PCA approach at the beginning of this chapter, we
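One way to project samples that were not part of the training data is to fit the transformation once and reuse it. The sketch below uses scikit-learn's `KernelPCA` class instead of a hand-written kernel PCA, so it is a substitute for, not a copy of, the approach the excerpt goes on to describe. The dataset size, the train/new split, and `gamma=15` are assumptions made for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA

# Half-moon data; the last 20 samples stand in for "new" data collected later.
X, y = make_moons(n_samples=120, random_state=123)
X_train, X_new = X[:100], X[100:]

# Fit the RBF kernel PCA on the training samples only.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=15)
X_train_proj = kpca.fit_transform(X_train)

# Project unseen samples onto the components learned from the training set.
X_new_proj = kpca.transform(X_new)

print(X_train_proj.shape, X_new_proj.shape)
```

Unlike standard PCA, there is no explicit projection matrix to reuse here: projecting a new sample requires evaluating the kernel between that sample and every training sample, which is why the fitted object has to keep the training data around.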
569 samples of malignant and benign tumor cells. The first two columns in the dataset store the unique ID numbers of the samples and the corresponding diagnosis (M=malignant, B=benign), respectively. Columns 3-32 contain 30 real-valued features that have been computed from digitized images of the cell nuclei, which can be used to build a model to predict whether a tumor is benign or malignant. The Breast Cancer Wisconsin dataset has been deposited in the UCI Machine Learning Repository and
take a look at two very simple yet powerful diagnostic tools that can help us to improve the performance of a learning algorithm: learning curves and validation curves. In the next subsections, we will discuss how we can use learning curves to diagnose if a learning algorithm has a problem with overfitting (high variance) or underfitting (high bias). Furthermore, we will take a look at validation curves that can help us address the common issues of a learning algorithm. Diagnosing bias and
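A learning curve of the kind mentioned in this excerpt can be computed with scikit-learn's `learning_curve` function. The following is a minimal sketch under stated assumptions: it uses the copy of the Breast Cancer Wisconsin data bundled with scikit-learn (which omits the ID column), and the pipeline, the five training sizes, and the 5-fold cross-validation are illustrative choices, not the book's settings.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Standardize the features, then fit a logistic regression classifier.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Train on growing subsets of the data and score each with 5-fold CV.
train_sizes, train_scores, val_scores = learning_curve(
    pipe, X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    shuffle=True,
    random_state=1,
)

# Average over the folds to get one training and one validation score per size.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)

for n, tr, va in zip(train_sizes, train_mean, val_mean):
    print(f"{int(n):4d} samples: train={tr:.3f}  validation={va:.3f}")
```

A large, persistent gap between the training and validation scores points to overfitting (high variance), while two low scores that sit close together point to underfitting (high bias).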