Benjamin is an experienced data scientist and software developer who is currently a Ph.D. candidate in Computer Science at the University of Maryland, College Park, working on applied machine learning and distributed systems. His background includes significant professional and military experience beyond his academic work. He has been a professional Python software developer for the past eleven years and a professional data scientist since earning his Master of Science in Computer Science from North Dakota State University in 2008. In addition to being a research assistant at Maryland, Benjamin is an adjunct faculty member at Georgetown University, teaching Machine Learning in the Data Science certificate program. He is an emeritus board member of Data Community DC and a faculty member of District Data Labs. In his role at District Data Labs, he collaborates with local researchers on inclusive, high-impact open source software development. He is one of the core developers of Scikit-Yellowbrick, a visual steering library for machine learning with Scikit-Learn. His main research interests include distributed storage systems, natural language processing, machine and statistical learning, distributed computation, and multi-agent systems.
Benjamin has published papers at the IEEE ICDCS, SSCI, and WCNC conferences as well as several O'Reilly books and talks. His primary publication topics are big data, distributed analytics, graph analytics, and natural language processing. His book publications include The Practical Data Science Cookbook (Packt), Data Analytics with Hadoop: An Introduction for Data Scientists (O'Reilly), and the forthcoming Applied Text Analytics with Python (O'Reilly). Benjamin has also given several recent talks at Strata + Hadoop World, PyCon, Data Day Seattle, PyData Carolinas, and PyData DC.