Artificial intelligence is advancing rapidly, though still not comparable to human intelligence.
AI can do amazing things when used to solve problems framed the right way. I’m writing this blog to make a deeper understanding of AI’s capabilities more accessible, and to use that shared understanding to explore the societal implications of AI: cybercrime, data privacy, and public policy.
A common misconception about AI is the expectation of an impending singularity, where machine intelligence will surpass human intelligence and threaten the future of civilization. By describing what this technology is actually capable of, the areas where it is advancing and not advancing, and the principles that describe its limits, we can set aside these fears and have a reasonable conversation about the technology that, more than any other, is shaping our modern world.
Previously, I led data science on the News Feed Experience team at Facebook, built AI tools that helped government intelligence analysts deal more effectively with the massive amounts of data they work with, and spent over a decade at Microsoft, where I led applied machine learning teams protecting users from large-scale Internet abuse. I also worked at Grammarly, an AI startup, where I led a team of applied researchers, machine learning engineers, and computational linguists with the mission of helping people communicate more effectively. I received a Ph.D. in computational linguistics from the University of Washington.
Outside of my work, my wife and I take delight in watching our two young children learn and grow. While this takes most of my time, I also love playing soccer, scuba diving in the tropics, traveling to Italy, and watching sprint car races.
The summary above highlights my background and what this blog is about. This section gives a more detailed account, particularly the problems I’ve worked on and the path that led me to where I am today in the field of AI.
I currently work at Twitter, supporting the teams responsible for Twitter’s knowledge graph. This includes an infrastructure team responsible for a platform used anywhere within Twitter that external data or knowledge about the world can improve the experience. The platform is widely used across the company, powering features such as letting users follow topics, protecting users from abuse, improving ad targeting, and providing real-time sports scores.
My organization also includes a machine learning team responsible for expanding and improving the knowledge graph, as well as using models built on top of the knowledge graph to help partner teams across the company improve their systems.
Before Twitter, I worked at Grammarly, where I led teams focused on using linguistic insights, data, and statistical methods to identify patterns in language data that can be used to help people improve their communication.
Grammarly provides a writing assistant that helps not just with detecting grammatical mistakes, but increasingly also helps with higher-order ways of improving communication.
This includes things like improving the clarity of sentences or detecting the tone of writing to help people realize how their text may come across. Artificial intelligence is critical to enabling features like these, and building it was my teams’ responsibility.
At Grammarly, I led a computational linguistics team, a machine learning team, and an applied research team. These roles varied in the techniques and skillsets involved, but each centered on language data and artificial intelligence.
It is an incredibly exciting time to be working on problems involving natural language, with recent revolutionary advances dramatically improving the state of the art across a wide range of language tasks.
This has opened many new opportunities for building innovative products in this space, and yet it is challenging to keep up with these advances in production systems since the capabilities of these techniques are improving so rapidly.
Prior to Grammarly, I was a Director of Product Engineering at a small natural language startup called Primer.
There are places in the economy where armies of humans read many documents and then summarize them for others to process and act on.
In the financial sector, analysts keep up with what companies in a particular sector are doing to help inform investment decisions. In the government intelligence community, an analyst may be responsible for staying up-to-date on everything happening with the economy in Iraq or terrorism in Syria.
I led the development of Primer’s Global product, which helped intelligence analysts make sense of the large quantities of data they were responsible for staying on top of; it accounted for the vast majority of Primer’s business.
To help analysts more effectively digest the growing quantities of information available online, we built systems that clustered similar items together, reducing the number of data points an analyst needed to review. When an analyst wanted to dig deeper, we extracted entities such as people and locations, letting them see all the relevant data about a particular entity across a large data set.
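To give a flavor of the kind of pipeline described above, here is a deliberately toy sketch (not Primer’s actual system): greedy word-overlap clustering to group near-duplicate documents, plus a naive capitalized-word heuristic standing in for real entity extraction. All names and thresholds here are illustrative assumptions.

```python
from collections import defaultdict

def tokenize(text):
    # Crude tokenization: split on whitespace, strip punctuation, lowercase.
    return {w.strip(".,").lower() for w in text.split()}

def jaccard(a, b):
    # Word-overlap similarity between two token sets.
    return len(a & b) / len(a | b)

def cluster(docs, threshold=0.5):
    """Greedy single-pass clustering: each document joins the first
    cluster whose seed document is similar enough, else starts a new one."""
    clusters = []  # list of (seed_tokens, [doc indices])
    for i, doc in enumerate(docs):
        toks = tokenize(doc)
        for seed, members in clusters:
            if jaccard(toks, seed) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((toks, [i]))
    return [members for _, members in clusters]

def extract_entities(text):
    """Naive stand-in for entity extraction: capitalized words
    that are not at the start of the sentence."""
    words = text.split()
    return {w.strip(".,") for w in words[1:] if w[:1].isupper()}

docs = [
    "Protests erupted in Baghdad over fuel prices.",
    "Fuel price protests erupted in Baghdad.",
    "A new trade agreement was signed in Geneva.",
]
groups = cluster(docs)           # near-duplicates end up together
entity_index = defaultdict(list)  # entity -> documents mentioning it
for i, doc in enumerate(docs):
    for ent in extract_entities(doc):
        entity_index[ent].append(i)
```

A real system would replace both heuristics with trained models, but the shape is the same: deduplicate first to shrink the reading load, then index by entity so an analyst can pull everything about, say, one person or place.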
This was my first experience at a startup after spending many years at large tech companies. While I’d often worked on problems involving text, and much of my education had been in this area, this was my first opportunity to really focus on natural language processing problems.
At Facebook I was the Head of Analytics for the News Feed Experience team, leading a team of data scientists that included people in the Seattle office I worked out of as well as Facebook’s headquarters in Menlo Park.
News Feed is responsible for the vast majority of usage at Facebook, and my time here involved working on the core functionality of the product: what posts in News Feed look like, how the sharing and commenting experiences work, building new photo consumption experiences, improving the responsiveness of the product, tackling various ranking problems that help people see the most relevant content and suggestions, and more.
Facebook is an amazing place to work. If nothing else, the quality of the data they have to work with is incredible for anyone interested in AI, and the impact possible in a product with two billion users is enormous.
I’d likely still be there today if not for bad timing. I had a long commute and wanted to spend more time with my new baby.
Prior to Facebook, I started my career at Microsoft and spent over a decade there in the artificial intelligence space. As a result, many of my formative experiences were there.
I joined Microsoft for a summer internship, fully intending to go into robotics and only taking the job because it paid twice as much as I could get at hardware companies. For a single summer in college, that sounded well worth it. And I love writing code, so it would be fun.
I learned more in those three months at Microsoft than I had in the previous two years in school. Perhaps due to my interest in robotics, I happened to get paired with a team doing early work in artificial intelligence.
Created initially out of Microsoft Research, the team I joined was among the first (maybe the first) at Microsoft to ship machine learning solutions to customers. It was responsible for a spam filter in Microsoft Outlook and later Hotmail.com (now called Outlook.com).
The people I met at Microsoft were some of the smartest I’d ever known, and the problems they were tackling were well beyond anything in my artificial intelligence textbook from school. The term “data science,” much less “machine learning engineering,” hadn’t even been conceived yet.
I loved the experience but still planned to go back to school and get a Ph.D. in robotics instead of joining Microsoft. It was only when I got back to school that I fully realized how much more I had been learning at Microsoft, and how brilliant my colleagues there were. I decided to return full-time.
They formed a small team around me and one other person who seemed to have a knack for working with data, statistics, and machine learning systems. Most people with a computer science background knew a decent amount of math, but rarely statistics. Coming from electrical engineering and economics, we happened to have what would quickly become a useful combination of skills.
We learned from this and grew the team, ending up finding people with relevant skills and advanced degrees from a diverse array of academic backgrounds: physics, electrical engineering, experimental psychology, industrial engineering, and others.
It was like the early days of the computing industry, with a bunch of smart people from different backgrounds figuring out how to make machine learning work in practice.
Our team expanded from working on spam filtering to preventing many other forms of abuse: passwords getting stolen on phishing websites, fake accounts being created, accounts getting hacked, spam sent via instant messaging, protecting kids from inappropriate websites, preventing people from getting tricked into downloading viruses online, and more.
By the end of my time there I was a Principal Data Science Manager leading a team of generalists responsible for everything from production machine learning systems to analytics; in addition, I led several teams of contractors responsible for business support processes and data labeling.
To summarize: I started out on a tiny team of two, and since machine learning wasn’t yet widespread, we had to try many approaches to figure out how to scale into a large team solving end-to-end problems with data, spanning everything from analytics to machine learning to business support processes and data labeling.
This opportunity to help frame the problems and roles gave me a unique perspective that would be useful in my startup experience later on in my career.
As far as my formal education is concerned, I approached my undergraduate studies with the goal of understanding every major layer of computing. My intent was to go into robotics, focusing primarily on the artificial intelligence systems behind the robots, particularly how knowledge is represented.
I earned a degree in electrical engineering to understand electromagnetics, circuits, and all the basic science and technology behind computers.
I got a degree in computer engineering to understand how digital circuits can be used to represent information, perform computation on data, and enable software to work.
I obtained a degree in computer science to understand how to take full advantage of the most flexible and powerful tool ever invented: the general-purpose computer. While I specifically cared about AI applications, at the time it was difficult to focus on this before graduate school so I only skimmed the surface of it at this point.
I decided to attend The University of Arizona, an in-state school that was not only free to me but came with enough scholarship money that I wouldn’t need to work through school, enabling me to focus entirely on my studies.
I had enough credits to count as a junior when I entered college, though sequences of required classes would prevent me from graduating in less than three years. With four years fully paid for, I instead decided to challenge myself by taking as many classes and earning as many relevant degrees as I could.
In addition to the aforementioned degrees, I was close to a degree in physics as well. I learned that quantum mechanics was critical to the functioning of transistors, the most important building blocks of circuits, and so I spent time studying this area. I also spent substantial time studying math and philosophy, which I believed would help me in my pursuit of AI.
Focused on robotics, I didn’t realize that artificial intelligence would become a major force in the software industry before robotic systems became practical in everyday life. Fortunately, this extra effort on math and science helped me prepare for the technical requirements for machine learning, which required math most computer science graduates at the time weren’t exposed to.
I had originally planned to continue straight from my undergraduate studies into a Ph.D. program in robotics. But the formative internship at Microsoft, which helped me realize I could learn more in industry, led me down a more circuitous route to a doctorate.
It became clear to me that machine learning was the area in AI most primed for substantial real-world impact, and I wanted to learn this beyond the practical skills I was picking up at work. I read some papers, but wanted more formal education.
I didn’t want to tweak the innards of an algorithm for a couple percent improvement; I felt the field’s value would come mostly from applying the technology. So, while working at Microsoft, I started a professional master’s degree in computational linguistics at the University of Washington. Natural language processing, roughly synonymous with computational linguistics, is another sub-discipline of AI, like machine learning.
I love language; it’s central to meaning representation within AI, and this program let me train in applying machine learning without spending years of research tweaking a particular algorithm.
With that program complete, I then transitioned into the computational linguistics Ph.D. program at the University of Washington, where I earned my doctorate degree.
My research in graduate school focused on a variety of areas: information extraction, relation extraction, knowledge representation, and conceptual hierarchy. I was focused on practical applications and my career in industry, so I treated this time as an opportunity to learn how to conduct academic research.
I love learning well beyond formal education. I own and have taken over a hundred of The Great Courses, and I frequently study on Udemy, where I have about half as many. When not engaged with a course, I’m constantly reading or listening to non-fiction audiobooks.
I’ve found that outside of a dedicated, focused program of study, where I’m doing homework and putting in real effort, I don’t learn as deeply as I did in school.
While I do sometimes study a subject intensively, I more commonly treat these learning opportunities as ways of expanding my awareness so I can make connections I wouldn’t otherwise have noticed. I can then dig deeper into the handful of relevant areas as opportunities arise.
Beyond this context expansion, there are some areas where I’ve spent extra time learning that warrant mention.
Humans encode their knowledge into encyclopedias and dictionaries. While everyone who has edited Wikipedia has some idea how to build an encyclopedia, dictionaries are more interesting. They systematically represent the concepts we use to communicate, which is relevant to the meaning representation and natural language processing sub-disciplines of AI I’m fascinated by.
In addition to reading books on the topic (The Oxford Guide to Practical Lexicography is among my ten favorite books), I’ve also participated in Lexicom, the premier workshop for training lexicographers to build dictionaries.
On a completely separate track, working in a field where most of the value takes the form of intellectual property, I have a substantial interest in patent law. This, along with the thought of someday running my own company, encouraged me to learn the US patent system.
I studied for and passed the patent bar, becoming a licensed patent agent. This enables me to practice law in the form of prosecuting patent applications, as well as handling appeals and review proceedings before the US Patent and Trademark Office.
In college, I spent some time studying philosophy since this is the field that studies the nature of knowledge and how it’s formed. In addition to this, I also graduated from a three-year philosophy program at the Objectivist Academic Center.