Writing a book on NLP is a bit like solving a complex data science project

An interview with Lewis Tunstall, co-author of the book Natural Language Processing with Transformers

Feb 6, 2023

Photo courtesy of Lewis Tunstall

A series of interviews highlighting the incredible work of writers in the space of data science and their path of writing.

"In fiction, the language and the senses it evokes are important, whereas in technical writing, the content, and the information it conveys, are important." ― Krista Van Laan, The Insider’s Guide to Technical Writing

Last edited on Feb 6, 2023

Being a writer myself, I have a keen interest in uncovering the narratives behind the books we read, especially in the machine learning realm. The ability of these writers to translate the complexities of AI into words that are both informative and interesting is truly remarkable. It is my goal, through a series of interviews, to bring their stories to the forefront and shed light on the journeys of some of the well-known authors in the field of Artificial Intelligence.

Meet the Author: Lewis Tunstall

Lewis Tunstall is an accomplished machine learning engineer currently working at Hugging Face. He has extensive experience in building machine learning applications for startups and enterprises, with a focus on the areas of NLP, topological data analysis, and time series. With a PhD in theoretical physics, Lewis has had the opportunity to hold research positions in various countries, including Australia, the USA, and Switzerland. His current work focuses on developing innovative tools for the NLP community and empowering individuals with the knowledge and skills to use them effectively.

Lewis is the co-author of the book "Natural Language Processing with Transformers", along with Leandro von Werra and Thomas Wolf. The book is a comprehensive guide to the latest advancements in the field of NLP and is a great resource for anyone looking to gain a deeper understanding of NLP and how it can be applied to real-world problems.

Natural Language Processing with Transformers, Revised Edition

Q: How did the idea of this book originate?

Lewis: Although we began the book in 2020, its origin story really began in 2019 when Leandro and I first started working with Transformer models. At the time, Jay Alammar’s amazing blog posts and The Annotated Transformer by Sasha Rush were among the few written resources available to understand how these models work. These articles were (and are!) great for developing understanding, but we felt there was a gap in guiding people on how to apply Transformers to industrial use cases. So in 2020, we had the somewhat foolhardy idea to combine the knowledge we’d gained from our jobs into a book. My wife suggested that we contact Thomas to see if he’d be interested in being a co-author, and to our great surprise, he agreed!

Q: Could you summarize the main points covered in the book for the readers?

Lewis: As you might expect from the title, this book is about applying Transformer models to NLP tasks. Most chapters are structured around a single use case you’re likely to encounter in the industry. The book covers core applications such as text classification, named entity recognition, and question answering. We take a lot of inspiration from the fantastic fast.ai course (which is how I got started with deep learning!), so the book is written in a hands-on style, emphasising solving real-world problems with code. In the early chapters, we introduce the concepts of self-attention and transfer learning, which underpin the success of Transformers.

The latter part of the book dives into more advanced topics, such as optimising Transformers for production environments and handling scenarios where you have little labelled data (i.e. every data scientist’s nightmare 😃). One of my favourite chapters is about training a GPT-2 scale model from scratch, including how to create a large-scale corpus and train on distributed infrastructure! The book concludes with an eye towards the future by highlighting some of the exciting recent developments involving Transformers and other modalities like images and audio.

Q: You co-authored the book with Leandro von Werra and Thomas Wolf. How is it to author a book with multiple writers?

Lewis: Writing a book on NLP is a bit like solving a complex data science project. Among various challenges, you need to design the use case and story for each chapter, find appropriate data and models, and make sure that you can keep code complexity to a minimum because reading long blocks of code is no fun at all. And just like any data science project, some chapters involved running dozens of experiments. For all these challenges, I found writing the book with Leandro and Thomas to be an extremely valuable experience! In particular, having co-authors with whom you can brainstorm ideas or sanity-check your code was especially helpful. Of course, one challenge that arises with multiple authors is trying to keep the same writing style throughout the book, and we were lucky to have great editors at O’Reilly to help us achieve this.

Q: Who do you think is the target audience for the book?

Lewis: We wrote this book for data scientists and machine learning/software engineers who may have heard about the recent breakthroughs involving Transformers but still need an in-depth guide to help them adapt these models to their own use cases. In other words, people like myself about 1–2 years ago! We envision our book will be most helpful to industry practitioners or those hoping to break into NLP from a nearby field (e.g. a software engineer who wants to build machine learning-powered applications).

Q: What, according to you, is the best way to make the most of this book: read first and code later, or code along?

Lewis: We actually wrote the whole book using Jupyter notebooks and a tool called fastdoc, so every line of code can be executed in the accompanying notebooks that we provide on GitHub. I suggest having the book chapter and accompanying code side-by-side, so you can quickly experiment with the inputs and outputs while you’re reading. We are also planning some live events around the book’s release, so stay tuned on the book’s website to find out when these events will happen.

Q: What advice would you give a new writer, someone just starting?

Lewis: Hehe, my first piece of advice would be to ask whether you’re really sure you want to write a book. My second piece of advice would be: don’t write a book just after you’ve had a baby … in the middle of a pandemic 😃

Jokes aside, the main advice I’d suggest to new writers is to find co-authors or colleagues who can deeply critique your ideas and writing. We were fortunate to have experienced writers like Aurélien Géron and Hamel Husain provide us with detailed comments on our drafts, and their feedback significantly improved the book. This type of feedback is invaluable because, as a writer, you sometimes forget which aspects were challenging to master the first time you learned a subject. In a curious twist of fate, Aurélien wrote the foreword to our book!

Q: What is your favourite book, and who is your favourite author (in the technical or non-technical space)?

Lewis: Ooh, this is a tricky question! I’m an avid reader of science fiction, and The Three-Body Problem by Liu Cixin is one of my favourite series because it provides a really novel solution to the Fermi paradox – i.e. why is there no observable evidence of alien civilisations, given the large number of stars with Earth-like planets? I also enjoy reading books and biographies about the history of scientific fields like physics and computer science. One recent favourite of mine is The Man from the Future by Ananyo Bhattacharya, which is about the life of the extraordinary John von Neumann. Although I had encountered some of von Neumann’s work as a physics student (he wrote a masterpiece on quantum mechanics), I was not fully aware of how vast his contributions were to mathematics and science at large. The book does a wonderful job of describing his work and the fascinating history around it – highly recommended!

👉 Would you like to connect with Lewis? Follow him on Twitter.

👉 Read other interviews in this series:

Don’t just take notes – turn them into articles and share them with others

You do not become better by employing fancy techniques but by working on the fundamentals

Publishing Is Powerful as It Serves as a Catalyst for Scope and Writing Decisions

Written By

Parul Pandey

Artificial Intelligence, Data Science, Editor’s Picks, Machine Learning, Natural Language Processing
