Data Science Essentials in Python
Collect → Organize → Explore → Predict → Value
by: Dmitry Zinoviev
Published | 2016-08-11 |
---|---|
Internal code | dzpyds |
Print status | In Print |
Pages | 224 |
User level | Intermediate |
Keywords | data science, big data, python, databases, network analysis, natural language processing, machine learning, visualization |
Related titles | “Practical Programming” by Paul Gries, Jennifer Campbell, and Jason Montojo |
ISBN | 9781680501841 |
Other ISBN | Channel epub: 9781680503388; Channel PDF: 9781680503395; Kindle: 9781680502220; Safari: 9781680502237 |
BISACs | BUS019000 BUSINESS & ECONOMICS / Decision-Making & Problem Solving; COM051360 COMPUTERS / Programming Languages / Python |
Highlight
Go from messy, unstructured artifacts stored in SQL and NoSQL databases to a neat, well-organized dataset with this quick reference for the busy data scientist. Understand text mining, machine learning, and network analysis; process numeric data with the NumPy and Pandas modules; describe and analyze data using statistical and network-theoretical methods; and see actual examples of data analysis at work. This one-stop solution covers the essential data science you need in Python.
Description
Data science is one of the fastest-growing disciplines in terms of academic research, student enrollment, and employment. Python, with its flexibility and scalability, is quickly overtaking the R language for data science projects. Keep Python data-science concepts at your fingertips with this modular, quick reference to the tools used to acquire, clean, analyze, and store data.
This one-stop solution covers essential Python, databases, network analysis, natural language processing, elements of machine learning, and visualization. Access structured and unstructured text and numeric data from local files, databases, and the Internet. Arrange, rearrange, and clean the data. Work with relational and non-relational databases, data visualization, and simple predictive analysis (regressions, clustering, and decision trees). See how typical data analysis problems are handled. And try your hand at your own solutions to a variety of medium-scale projects that are fun to work on and look good on your resume.
Keep this handy quick guide at your side whether you’re a student, an entry-level data science professional converting from R to Python, or a seasoned Python developer who doesn’t want to memorize every function and option.
Contents and Extracts
- Acknowledgments
- <b>Preface</b>
  - About This Book
  - About the Audience
  - About the Software
  - Notes on Quotes
  - The Book Forum
  - Your Turn
- What Is Data Science
  - Data Analysis Sequence
  - Data Acquisition Pipeline
  - Report Structure
  - Your Turn
- Core Python for Data Science <b>excerpt</b>
  - Understanding Basic String Functions
  - Choosing the Right Data Structure
  - Comprehending Lists through List Comprehension
  - Counting with Counters
  - Working with Files
  - Reaching the Web
  - Pattern Matching with Regular Expressions
  - Globbing File Names and Other Strings
  - Pickling and Unpickling Data
  - Your Turn
- Working with Text Data
  - Processing HTML Files
  - Handling CSV Files
  - Reading JSON Files
  - Processing Texts in Natural Languages
  - Your Turn
- Working with Databases
  - Setting Up a MySQL Database
  - Using a MySQL Database: Command Line
  - Using a MySQL Database: PyMySQL
  - Taming Document Stores: MongoDB
  - Your Turn
- Working with Tabular Numeric Data <b>excerpt</b>
  - Creating Arrays
  - Transposing and Reshaping
  - Indexing and Slicing
  - Broadcasting
  - Demystifying Universal Functions
  - Understanding Conditional Functions
  - Aggregating and Ordering Arrays
  - Treating Arrays as Sets
  - Saving and Reading Arrays
  - Generating a Synthetic Sine Wave
  - Your Turn
- Working with Data Series and Frames
  - Getting Used to Pandas Data Structures
  - Reshaping Data
  - Handling Missing Data
  - Combining Data
  - Ordering and Describing Data
  - Transforming Data
  - Taming Pandas File I/O
  - Your Turn
- Working with Network Data
  - Dissecting Graphs
  - Network Analysis Sequence
  - Harnessing Networkx
  - Your Turn
- Plotting <b>excerpt</b>
  - Basic Plotting with PyPlot
  - Getting to Know Other Plot Types
  - Mastering Embellishments
  - Plotting with Pandas
  - Your Turn
- Probability and Statistics
  - Reviewing Probability Distributions
  - Recollecting Statistical Measures
  - Doing Stats the Python Way
  - Your Turn
- Machine Learning
  - Designing a Predictive Experiment
  - Fitting a Linear Regression
  - Grouping Data with k-Means Clustering
  - Surviving in Random Decision Forests
  - Your Turn
- Further Reading
- Solutions to Single-Star Projects