Python Book Introduction: Mastering Python Data Visualization

Mastering Python Data Visualization

Kirthi Raman

 

Table of Contents
Preface
Chapter 1: A Conceptual Framework for Data Visualization
Data, information, knowledge, and insight
Data
Information
Knowledge
Data analysis and insight
The transformation of data
Transforming data into information
Data collection
Data preprocessing
Data processing
Organizing data
Getting datasets
Transforming information into knowledge
Transforming knowledge into insight
Data visualization history
Visualization before computers
Minard's Russian campaign (1812)
The Cholera epidemics in London (1831-1855)
Statistical graphics (1850-1915)
Later developments in data visualization
How does visualization help decision-making?
Where does visualization fit in?
Data visualization today
What is a good visualization?
Visualization plots
Bar graphs and pie charts
Bar graphs
Pie charts
Box plots
Scatter plots and bubble charts
Scatter plots
Bubble charts
KDE plots
Summary

Chapter 2: Data Analysis and Visualization
Why does visualization require planning?
The Ebola example
A sports example
Visually representing the results
Creating interesting stories with data
Why are stories so important?
Reader-driven narratives
Gapminder
The State of the Union address
Mortality rate in the USA
A few other example narratives
Author-driven narratives
Perception and presentation methods
The Gestalt principles of perception
Some best practices for visualization
Comparison and ranking
Correlation
Distribution
Location-specific or geodata
Part-to-whole relationships
Trends over time
Visualization tools in Python
Development tools
Canopy from Enthought
Anaconda from Continuum Analytics
Interactive visualization
Event listeners
Layouts
Circular layout
Radial layout
Balloon layout
Summary

Chapter 3: Getting Started with the Python IDE
The IDE tools in Python
Python 3.x versus Python 2.7
Types of interactive tools
IPython
Plotly
Types of Python IDE
PyCharm
PyDev
Interactive Editor for Python (IEP)
Canopy from Enthought
Anaconda from Continuum Analytics
Visualization plots with Anaconda
The surface-3D plot
The square map plot
Interactive visualization packages
Bokeh
VisPy
Summary

Chapter 4: Numerical Computing and Interactive Plotting
NumPy, SciPy, and MKL functions
NumPy
NumPy universal functions
Shape and reshape manipulation
An example of interpolation
Vectorizing functions
Summary of NumPy linear algebra
SciPy
An example of linear equations
The vectorized numerical derivative
MKL functions
The performance of Python
Scalar selection
Slicing
Slice using flat
Array indexing
Numerical indexing
Logical indexing
Other data structures
Stacks
Tuples
Sets
Queues
Dictionaries
Dictionaries for matrix representation
Sparse matrices
Dictionaries for memoization
Tries
Visualization using matplotlib
Word clouds
Installing word clouds
Input for word clouds
Web feeds
The Twitter text
Plotting the stock price chart
Obtaining data
The visualization example in sports
Summary

Chapter 5: Financial and Statistical Models
The deterministic model
Gross returns
The stochastic model
Monte Carlo simulation
What exactly is Monte Carlo simulation?
An inventory problem in Monte Carlo simulation
Monte Carlo simulation in basketball
The volatility plot
Implied volatilities
The portfolio valuation
The simulation model
Geometric Brownian simulation
The diffusion-based simulation
The threshold model
Schelling's Segregation Model
An overview of statistical and machine learning
K-nearest neighbors
Generalized linear models
Bayesian linear regression
Creating animated and interactive plots
Summary

Chapter 6: Statistical and Machine Learning
Classification methods
Understanding linear regression
Linear regression
Decision tree
An example
The Bayes theorem
The Naïve Bayes classifier
The Naïve Bayes classifier using TextBlob
Installing TextBlob
Downloading corpora
The Naïve Bayes classifier using TextBlob
Viewing positive sentiments using word clouds
k-nearest neighbors
Logistic regression
Support vector machines
Principal component analysis
Installing scikit-learn
k-means clustering
Summary

Chapter 7: Bioinformatics, Genetics, and Network Models
Directed graphs and multigraphs
Storing graph data
Displaying graphs
igraph
NetworkX
Graph-tool
The clustering coefficient of graphs
Analysis of social networks
The planar graph test
The directed acyclic graph test
Maximum flow and minimum cut
A genetic programming example
Stochastic block models
Summary

Chapter 8: Advanced Visualization
Computer simulation
Python's random package
SciPy's random functions
Simulation examples
Signal processing
Animation
Visualization methods using HTML5
How is Julia different from Python?
D3.js for visualization
Dashboards
Summary

Appendix: Go Forth and Explore Visualization
An overview of conda
Packages installed with Anaconda
Packages websites
About matplotlib
Index