Interested in Machine Learning? You are not alone! More people are getting interested in Machine Learning every day. In fact, you’d be hard pressed to find a field generating more buzz these days than this one. Machine Learning’s inroads into our collective consciousness have been both history-making (as when AlphaGo won 4 of 5 Go matches against one of the world’s best Go players!) and hysterical (“Machine Learning Algorithm Identifies Tweets Sent Under The Influence Of Alcohol”), but regardless of how you discovered it, one thing is clear: Machine Learning has arrived.

That said, it’s one thing to get interested in Machine Learning; it’s another thing altogether to actually start working in the field. This post will help you understand both the overall mindset and the specific skills you’ll need to start working as a Machine Learning engineer.

To begin, there are two very important things that you should understand if you’re considering a career as a Machine Learning engineer. First, it’s not a “pure” academic role. You don’t necessarily have to have a research or academic background. Second, it’s not enough to have either software engineering or data science experience. You ideally need both.

Data Analyst vs. Machine Learning Engineer

It’s also critical to understand the differences between a Data Analyst and a Machine Learning engineer. In its simplest form, the key distinction has to do with the end goal. As a Data Analyst, you’re analyzing data in order to tell a story and to produce actionable insights. The emphasis is on dissemination: charts, models, visualizations. The analysis is performed and presented by human beings to other human beings, who may then go on to make business decisions based on what’s been presented. This is especially important to note: the “audience” for your output is human. As a Machine Learning engineer, on the other hand, your final “output” is working software (not the analyses or visualizations that you may have to create along the way), and your “audience” for this output often consists of other software components that run autonomously with minimal human supervision. The intelligence is still meant to be actionable, but in the Machine Learning model, the decisions are being made by machines and they affect how a product or service behaves. This is why the software engineering skill set is so important to a career in Machine Learning.

Understanding The Ecosystem

Before getting into specific skills, there is one more concept to address. Being a Machine Learning engineer necessitates understanding the entire ecosystem that you’re designing for.

Let’s say you’re working for a grocery chain, and the company wants to start issuing targeted coupons based on things like the past purchase history of customers, with a goal of generating coupons that shoppers will actually use. In a Data Analysis model, you could collect the purchase data, do the analysis to figure out trends, and then propose strategies. The Machine Learning approach would be to write an automated coupon generation system. But what does it take to write that system, and have it work? You have to understand the whole ecosystem—inventory, catalog, pricing, purchase orders, bill generation, Point of Sale software, CRM software, etc.
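To make the contrast concrete, here is a toy sketch in Python of what “an automated coupon generation system” might look like at its very smallest. Everything in it is an assumption made for illustration: the purchase log, the field names, and the function names are hypothetical, and simple frequency counting stands in for a real learned model. The point is only that the output is a record another piece of software can consume, not a chart for a human.

```python
from collections import Counter, defaultdict

# Hypothetical purchase log: (customer_id, product_category) pairs.
purchases = [
    ("alice", "coffee"), ("alice", "coffee"), ("alice", "bread"),
    ("bob", "pasta"), ("bob", "pasta"), ("bob", "cheese"), ("bob", "pasta"),
]

def favorite_categories(purchase_log):
    """Count purchases per customer and return each customer's top category."""
    counts = defaultdict(Counter)
    for customer, category in purchase_log:
        counts[customer][category] += 1
    return {c: counter.most_common(1)[0][0] for c, counter in counts.items()}

def generate_coupons(purchase_log, discount_percent=10):
    """Emit coupon records that a downstream system could consume."""
    favorites = favorite_categories(purchase_log)
    return [
        {"customer": customer, "category": category,
         "discount_percent": discount_percent}
        for customer, category in sorted(favorites.items())
    ]

coupons = generate_coupons(purchases)
for coupon in coupons:
    print(coupon)
```

In a real deployment, the hard part is everything around this function: where the purchase log comes from, which systems read the coupon records, and what happens when pricing or inventory changes.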

Ultimately, the process is less about understanding Machine Learning algorithms—or when and how to apply them—and more about understanding the systemic interrelationships, and writing working software that will successfully integrate and interface. Remember, Machine Learning output is actually working software!

Now, let’s get into the real details of what it takes to be a Machine Learning engineer. We’re going to break this into two primary sections: Summary of Skills, and Languages and Libraries. We’ll begin with the Summary of Skills here, then in a follow-up post we’ll address Languages and Libraries for Machine Learning.


Summary of Skills

1. Computer Science Fundamentals and Programming

Computer science fundamentals important for Machine Learning engineers include data structures (stacks, queues, multi-dimensional arrays, trees, graphs, etc.), algorithms (searching, sorting, optimization, dynamic programming, etc.), computability and complexity (P vs. NP, NP-complete problems, big-O notation, approximation algorithms, etc.), and computer architecture (memory, cache, bandwidth, deadlocks, distributed processing, etc.).

You must be able to apply, implement, adapt or address them (as appropriate) when programming. Practice problems, coding competitions and hackathons are a great way to hone your skills.
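As one example of the kind of practice problem that exercises several of these fundamentals at once, here is a short Python sketch of the classic 0/1 knapsack problem solved with dynamic programming. The item list and function name are made up for illustration; the technique itself is the standard textbook one.

```python
def knapsack(items, capacity):
    """Return the best total value for a 0/1 knapsack.

    items: list of (weight, value) pairs with non-negative integer weights.
    Runs in O(len(items) * capacity) time and O(capacity) memory.
    """
    # best[c] holds the best value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for weight, value in items:
        # Walk capacities downward so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]

items = [(1, 1), (3, 4), (4, 5), (5, 7)]  # hypothetical (weight, value) pairs
print(knapsack(items, 7))
```

Notice how the problem touches optimization, dynamic programming, and complexity analysis all at once, which is why problems like this show up so often in coding competitions.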

2. Probability and Statistics

A formal characterization of probability (conditional probability, Bayes’ rule, likelihood, independence, etc.) and techniques derived from it (Bayes Nets, Markov Decision Processes, Hidden Markov Models, etc.) are at the heart of many Machine Learning algorithms; these are a means to deal with uncertainty in the real world. Closely related to this is the field of statistics, which provides various measures (mean, median, variance, etc.), distributions (uniform, normal, binomial, Poisson, etc.) and analysis methods (ANOVA, hypothesis testing, etc.) that are necessary for building and validating models from observed data. Many Machine Learning algorithms are essentially extensions of statistical modeling procedures.
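If Bayes’ rule feels abstract, a tiny worked example may help. The Python sketch below computes the probability that a hypothesis is true given a positive signal. The numbers (a 1% prior, a 90% detection rate, a 5% false positive rate) and the function name are invented purely for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' rule for a binary hypothesis H and evidence E.

    prior:               P(H)
    likelihood:          P(E | H)
    false_positive_rate: P(E | not H)
    """
    # Total probability of seeing the evidence at all.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(p, 3))  # 0.154
```

Even with a fairly accurate signal, the posterior is only about 15% because the hypothesis was rare to begin with. This is exactly the kind of uncertainty reasoning that many Machine Learning algorithms build on.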

3. Data Modeling and Evaluation

Data modeling is the process of estimating the underlying structure of a given dataset, with the goal of finding useful patterns (correlations, clusters, eigenvectors, etc.) and/or predicting properties of previously unseen instances (classification, regression, anomaly detection, etc.). A key part of this estimation process is continually evaluating how good a given model is. Depending on the task at hand, you will need to choose an appropriate accuracy/error measure (e.g. log-loss for classification, sum-of-squared-errors for regression, etc.) and an evaluation strategy (training-testing split, sequential vs. randomized cross-validation, etc.). Iterative learning algorithms often directly utilize resulting errors to tweak the model (e.g. backpropagation for neural networks), so understanding these measures is very important even for just applying standard algorithms.
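To give a feel for what these measures and strategies look like in practice, here is a minimal Python sketch of log-loss, sum-of-squared-errors, and a randomized training-testing split, written with only the standard library. The function names and defaults are assumptions for illustration; in real work you would more likely reach for a library implementation.

```python
import math
import random

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood for binary labels (0/1) and predicted P(y=1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def sum_squared_errors(y_true, y_pred):
    """Sum of squared differences between targets and predictions."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and hold out a fraction of them for evaluation."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[:-n_test], rows[-n_test:]

# A confident, correct classifier scores a lower (better) log-loss
# than one that always predicts 0.5.
print(log_loss([1, 0], [0.9, 0.1]) < log_loss([1, 0], [0.5, 0.5]))

train, test = train_test_split(range(8))
print(len(train), len(test))
```

Note that a randomized split like this one is not appropriate for time-ordered data, which is where the sequential evaluation strategies mentioned above come in.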

4. Applying Machine Learning Algorithms and Libraries

Standard implementations of Machine Learning algorithms are widely available through libraries/packages/APIs (e.g. scikit-learn, Theano, Spark MLlib, H2O, TensorFlow, etc.), but applying them effectively involves choosing a suitable model (decision tree, nearest neighbor, neural net, support vector machine, ensemble of multiple models, etc.), a learning procedure to fit the data (linear regression, gradient descent, genetic algorithms, bagging, boosting, and other model-specific methods), as well as understanding how hyperparameters affect learning. You also need to be aware of the relative advantages and disadvantages of different approaches, and the numerous gotchas that can trip you up (bias and variance, overfitting and underfitting, missing data, data leakage, etc.). Data science and Machine Learning challenges such as those on Kaggle are a great way to get exposed to different kinds of problems and their nuances.
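To see why hyperparameters matter, consider this small Python sketch of gradient descent fitting a straight line. It is a from-scratch illustration, not how any particular library does it, and the data, function name, and default values are all made up for the example. The learning rate is the hyperparameter to watch.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data lying exactly on y = 2x + 1

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))

# Same data, same algorithm, but a learning rate that is too large:
# each step overshoots and the estimate moves away from the answer.
w_bad, b_bad = fit_line(xs, ys, learning_rate=0.2, epochs=50)
print(abs(w_bad - 2) > 1)
```

Nothing about the model or the data changed between the two runs. Only the hyperparameter did, and that was enough to turn a good fit into a useless one.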

5. Software Engineering and System Design

At the end of the day, a Machine Learning engineer’s typical output or deliverable is software. And often it is a small component that fits into a larger ecosystem of products and services. You need to understand how these different pieces work together, communicate with them (using library calls, REST APIs, database queries, etc.) and build appropriate interfaces for your component that others will depend on. Careful system design may be necessary to avoid bottlenecks and let your algorithms scale well with increasing volumes of data. Software engineering best practices (including requirements analysis, system design, modularity, version control, testing, documentation, etc.) are invaluable for productivity, collaboration, quality and maintainability.
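Here is a rough Python sketch of what “building an appropriate interface for your component” can mean in its simplest form: a function that accepts a JSON request, validates it, calls a model, and returns a JSON response. The request format, the class and function names, and the stand-in model are all hypothetical and chosen only for illustration.

```python
import json
from typing import Protocol, Sequence

class Model(Protocol):
    """Anything with a predict method can be plugged into the handler."""
    def predict(self, features: Sequence[float]) -> float: ...

class SumModel:
    """Hypothetical stand-in for a trained model: scores by summing features."""
    def predict(self, features):
        return float(sum(features))

def handle_request(model: Model, request_body: str) -> str:
    """Validate a JSON request, call the model, and return a JSON response."""
    try:
        payload = json.loads(request_body)
        features = [float(x) for x in payload["features"]]
    except (ValueError, KeyError, TypeError):
        # Malformed input gets a structured error instead of a crash.
        return json.dumps({"error": "expected JSON with a 'features' list of numbers"})
    return json.dumps({"score": model.predict(features)})

print(handle_request(SumModel(), '{"features": [1, 2.5]}'))
print(handle_request(SumModel(), "not json"))
```

Keeping the model behind a small, well-defined interface like this means the model can be retrained or swapped out without the rest of the system needing to change, and it gives you an obvious place to hang tests.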

Machine Learning Job Roles

Jobs related to Machine Learning are growing rapidly as companies try to get the most out of emerging technologies. The chart below depicts the relative importance of core skills for these general types of roles, with a typical Data Analyst role for comparison.

[Figure: Relative importance of core skills for different Machine Learning job roles]

The Future of Machine Learning

What is perhaps most compelling about Machine Learning is its seemingly limitless applicability. Many fields are already being impacted by Machine Learning, including education, finance, computer science, and more. There are also virtually NO fields to which Machine Learning doesn’t apply. In some cases, Machine Learning techniques are in fact desperately needed. Healthcare is an obvious example. Machine Learning techniques are already being applied to critical arenas within the healthcare sphere, impacting everything from care variation reduction efforts to medical scan analysis. David Sontag, an assistant professor at New York University’s Courant Institute of Mathematical Sciences and NYU’s Center for Data Science, recently gave a talk on Machine Learning and the healthcare system, in which he discussed “how machine learning has the potential to change health care across the industry, from enabling the next-generation electronic health record to population-level risk stratification from health insurance claims.”

The world is unquestionably changing in rapid and dramatic ways, and the demand for Machine Learning engineers looks set to keep growing. The world’s challenges are complex, and they will require complex systems to solve them. Machine Learning engineers are building these systems. If this is YOUR future, then there’s no time like the present to start mastering the skills and developing the mindset you’re going to need to succeed.