Skills for ML Engineering
1. Getting ready
- Do a lot of LeetCode problems to understand data structures and algorithms
- Get exposure to software design patterns (e.g. event-driven architecture)
- Get solid CS fundamentals by reading textbooks (e.g. on operating systems, compilers, and distributed and cloud computing)
- Get solid ML engineering fundamentals from a course such as https://course.fast.ai/ or https://www.coursera.org/professional-certificates/tensorflow-in-practice or https://www.udacity.com/course/deep-learning-pytorch--ud188
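As a taste of the event-driven pattern mentioned above, here is a minimal sketch of an in-process publish/subscribe bus in plain Python. The `EventBus` class, the event names and the file path are all hypothetical, made up for illustration; real systems usually put a message broker between publishers and subscribers.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish/subscribe bus (illustrative only)."""

    def __init__(self):
        # Maps an event name to the list of handlers subscribed to it.
        self._handlers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._handlers[event_name].append(handler)

    def publish(self, event_name, payload):
        # Call every handler registered for this event, in subscription order.
        for handler in self._handlers[event_name]:
            handler(payload)


bus = EventBus()
log = []

# A hypothetical ML flow: new data arrives, which triggers validation and retraining.
bus.subscribe("data_uploaded", lambda p: log.append(f"validate {p['path']}"))
bus.subscribe("data_uploaded", lambda p: bus.publish("retrain_requested", p))
bus.subscribe("retrain_requested", lambda p: log.append(f"retrain on {p['path']}"))

bus.publish("data_uploaded", {"path": "data/batch_001.csv"})
print(log)
```

The publisher of `data_uploaded` knows nothing about validation or retraining; new reactions are added by subscribing, not by editing the publisher.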
2. Learn from Google
Google is an industry leader in ML and surrounding best practices. What they have to say about the topic is useful.
- Do the crash course - https://developers.google.com/machine-learning/crash-course
- Read the People+AI guidebook - https://developers.google.com/machine-learning/guides
- Read Rules of ML - https://developers.google.com/machine-learning/guides/rules-of-ml
- Read the other guides - https://developers.google.com/machine-learning/guides
- Try the TensorFlow quickstart in Google Colab - https://www.tensorflow.org/tutorials/quickstart/beginner
- Read the TFX Guide and maybe try it out - https://www.tensorflow.org/tfx/guide
3. Write a pipeline
- If you have access to an AWS account, run a complete SageMaker pipeline https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-pipelines/nlp/amazon_comprehend_sagemaker_pipeline and then browse your AWS console and inspect all the resources it created
- If you have access to a Google Cloud account, run a Vertex AI pipeline https://cloud.google.com/vertex-ai/docs/pipelines/build-pipeline
- If you have access to an Azure account, build an Azure Machine Learning pipeline https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-machine-learning-pipelines
- If you don’t already have access to any of these, poke around the internet to see whether any of them offer what you need in a free tier (working out what a vendor gives away for free, and using it, is a skill too I guess ;)). If not, run a Kubeflow pipeline https://www.kubeflow.org/docs/components/pipelines/introduction/ or build an Airflow pipeline https://www.google.com/search?q=pytorch+airflow
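Whichever platform you pick, the underlying idea is the same: a pipeline is a set of named steps with declared dependencies, run in dependency order, each passing its output to the steps that need it. Below is a minimal sketch in plain Python that assumes nothing about any vendor's SDK. The step names, the toy data and the `run_pipeline` helper are all hypothetical and only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def ingest(_inputs):
    # Toy dataset: (feature, label) pairs following label = 2 * feature.
    return [(x, 2 * x) for x in range(10)]


def split(inputs):
    data = inputs["ingest"]
    return {"train": data[:8], "test": data[8:]}


def train(inputs):
    # The "model" is a single weight fitted by least squares through the origin.
    rows = inputs["split"]["train"]
    num = sum(x * y for x, y in rows)
    den = sum(x * x for x, _ in rows)
    return {"weight": num / den}


def evaluate(inputs):
    # Mean absolute error of the fitted weight on the held-out rows.
    w = inputs["train"]["weight"]
    rows = inputs["split"]["test"]
    return {"mae": sum(abs(w * x - y) for x, y in rows) / len(rows)}


# Each step name maps to (function, names of the steps it depends on).
STEPS = {
    "ingest": (ingest, []),
    "split": (split, ["ingest"]),
    "train": (train, ["split"]),
    "evaluate": (evaluate, ["train", "split"]),
}


def run_pipeline(steps):
    # Build a dependency graph and run every step after its dependencies.
    graph = {name: set(deps) for name, (_, deps) in steps.items()}
    artifacts = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = steps[name]
        artifacts[name] = func({d: artifacts[d] for d in deps})
    return artifacts


if __name__ == "__main__":
    results = run_pipeline(STEPS)
    print(results["train"], results["evaluate"])
```

Managed services such as the ones linked above add what this sketch leaves out: running each step in its own container, storing the artifacts, retries, scheduling and a UI to inspect runs.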
4. Research & explore the web
Just surf the internet. Maybe start with https://github.com/visenger/awesome-mlops and check out what others are up to, especially companies like Uber, Airbnb, Lyft, Shopify, Netflix, etc.
In particular, Facebook's post on FBLearner Flow, all the way back from 2016, is still as awesome as ever and is a brilliant introduction to design patterns for doing ML at a big company https://engineering.fb.com/2016/05/09/core-data/introducing-fblearner-flow-facebook-s-ai-backbone/
5. Andrew Ng
Do the Andrew Ng course at https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops. Just do it.
6. Talks and conferences
Develop a method to search the Internet and find people speaking about the topic. A lot of people speak about ML these days in different forums, events and conferences.
- Stanford MLSys seminars https://www.youtube.com/playlist?list=PLSrTvUm384I9PV10koj_cqit9OfbJXEkq
- DeepLearning.AI channel https://www.youtube.com/watch?v=Ta14KpeZJok
- AWS re:Invent 2021 keynote https://www.youtube.com/watch?v=ue9aumC7AAk
- A podcast I found on InfoQ’s channel https://www.youtube.com/watch?v=aPbQMx7zuOM
- The awesome-mlops index again https://github.com/visenger/awesome-mlops#talks-about-mlops
- My own talk at AWS re:Invent from 2019 https://youtu.be/fgLlEnhFZQA?t=1200