Machine Learning for Data Streams: with Practical Examples in MOA

Book
2018
Published by: The MIT Press
Summary

A hands-on approach to tasks and techniques in data stream mining and real-time analytics, with examples in MOA, a popular freely available open-source software framework.

Today many information sources—including sensor networks, financial markets, social networks, and healthcare monitoring—are so-called data streams, arriving sequentially and at high speed. Analysis must take place in real time, with partial data and without the capacity to store the entire data set. This book presents algorithms and techniques used in data stream mining and real-time analytics. Taking a hands-on approach, the book demonstrates the techniques using MOA (Massive Online Analysis), a popular, freely available open-source software framework, allowing readers to try out the techniques after reading the explanations.

The book first offers a brief introduction to the topic, covering big data mining, basic methodologies for mining data streams, and a simple example of MOA. More detailed discussions follow, with chapters on sketching techniques, change, classification, ensemble methods, regression, clustering, and frequent pattern mining. Most of these chapters include exercises, an MOA-based lab session, or both. Finally, the book discusses the MOA software, covering the MOA graphical user interface, the command line, use of its API, and the development of new methods within MOA. The book will be an essential reference for readers who want to use data stream mining as a tool, researchers in innovation or data stream mining, and programmers who want to create new algorithms for MOA.

Table of Contents

Cover

Series Page

Title Page

Copyright Page

Table of Contents

pp. v-xi

List of Figures

pp. xiii-xv

List of Tables

Preface

pp. xix-xxi

I: Introduction

1. Introduction

pp. 13-20

2. Big Data Stream Mining

pp. 21-29

3. Hands-on Introduction to MOA

pp. 31-41

II: Stream Mining

4. Streams and Sketches

pp. 45-75

5. Dealing with Change

pp. 77-93

6. Classification

pp. 95-137

7. Ensemble Methods

pp. 139-151

8. Regression

pp. 153-158

9. Clustering

pp. 159-173

10. Frequent Pattern Mining

pp. 175-193

III: The MOA Software

11. Introduction to MOA and Its Ecosystem

pp. 197-210

12. The Graphical User Interface

pp. 211-225

13. Using the Command Line

pp. 227-230

14. Using the API

pp. 231-236

15. Developing New Methods in MOA

pp. 237-248

Bibliography

pp. 249-265

Index

pp. 267-272

Series List

pp. 273-274