Books for scientists and data scientists

Thinking meta about data


Big Data, Little Data, No Data: Scholarship in the Networked World by Christine L. Borgman
The language is dense and academic, but this is the perfect book for anyone interested in how we got to the current state of data scholarship.  Borgman also offers some good ideas about how to improve it.

I found the survey of data practices in different fields fascinating.  The book is long, but to improve data scholarship you have to understand how we got here, what works, and what doesn't.  You can't do that without thorough research, which this book has in spades.

I thoroughly recommend this book for data professionals.  Early-career researchers in all fields should also skim it and learn the research data norms of their particular field (lest they break the social norms and tank their careers before they start).

Glut: Mastering Information through the Ages by Alex Wright
A fascinating history of information: how to collect, preserve, and catalog it.  Libraries have been destroyed by neglect, unhelpful curation, fires, and sacking by Visigoth, Roman, and Mongol armies.  You'll also learn the origin of "faceted" search.  A great read.

Raw Data Is an Oxymoron edited by Lisa Gitelman
IMHO, required reading for all graduate students.

Doing Data Science: Straight Talk from the Frontline by Cathy O'Neil and Rachel Schutt
The most hands-on of the meta books or the most meta of the hands-on books? Not many introductory books include a chapter on ethics, but more should.

Hands-on How-tos


A Hands-On Introduction to Using Python in the Atmospheric and Oceanic Sciences by Johnny Lin
While a free PDF version is available, I suggest purchasing the $5 fully hyperlinked PDF version or the $20 hardcover book (which includes the PDF download). If you read only one book, this is the one to help you get started.

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython by Wes McKinney
Broader and more general. It doesn't cover the common data formats encountered in the atmospheric, oceanic, and climate research communities.

R Cookbook by Paul Teetor
Useful data recipes that you can alter to perform your own tasks.  Install the rNOMADS package to read GRIB datasets into R.
