Skills for ML Engineering
Posted on Mar 3, 2022
1. Getting ready
Do a lot of LeetCode problems to understand data structures and algorithms
Get exposure to software design patterns (e.g. event-driven)
Get solid CS fundamentals by reading textbooks (e.g. Operating Systems, Compilers, Distributed and Cloud Computing)
Get solid ML engineering fundamentals from a course such as https://course.fast.ai/ or https://www.coursera.org/professional-certificates/tensorflow-in-practice or https://www.udacity.com/course/deep-learning-pytorch--ud188 .
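If "event-driven" is new to you, here is a minimal sketch of the idea in plain Python. It is only an illustration, not how any particular framework does it: the EventBus class, the "model.trained" event name and the handlers are all hypothetical names made up for this example.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[Any], None]


class EventBus:
    """Toy in-process publish/subscribe bus (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # Maps an event name to the handlers registered for it.
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def publish(self, event: str, payload: Any) -> int:
        # Call every handler registered for this event, in registration order.
        handlers = list(self._handlers.get(event, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


if __name__ == "__main__":
    bus = EventBus()
    # Two independent consumers react to the same event without knowing about each other.
    bus.subscribe("model.trained", lambda p: print("register model", p["name"]))
    bus.subscribe("model.trained", lambda p: print("notify team about", p["name"]))
    bus.publish("model.trained", {"name": "demo-model"})
```

The point of the pattern is that the publisher does not know who is listening, so new consumers can be added without touching the code that emits the event.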
2. Learn from Google
Google is an industry leader in ML and its surrounding best practices, so what they have to say about the topic is worth reading.
Do the crash course - https://developers.google.com/machine-learning/crash-course
Read the People+AI guidebook - https://developers.google.com/machine-learning/guides
Read Rules of ML - https://developers.google.com/machine-learning/guides/rules-of-ml
Read the other guides - https://developers.google.com/machine-learning/guides
Try the TensorFlow quickstart in Google Colab - https://www.tensorflow.org/tutorials/quickstart/beginner
Read the TFX Guide and maybe try it out - https://www.tensorflow.org/tfx/guide
3. Write a pipeline
If you have access to an AWS account, run a complete SageMaker pipeline https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker-pipelines/nlp/amazon_comprehend_sagemaker_pipeline and then browse your AWS console to inspect all the resources it created
If you have access to a Google Cloud account, run a Vertex AI pipeline https://cloud.google.com/vertex-ai/docs/pipelines/build-pipeline
If you have access to an Azure account, follow https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-machine-learning-pipelines
If you don’t already have access to any of these accounts, check whether any of the vendors offers these services in a free tier (working out what a vendor’s free tier covers, and using it, is a skill in itself I guess ;)). If not, run a Kubeflow pipeline https://www.kubeflow.org/docs/components/pipelines/introduction/ , or build an Airflow pipeline https://www.google.com/search?q=pytorch+airflow
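Before touching any of the hosted services, it can help to see what a "pipeline" boils down to. The sketch below is a toy in plain Python, not the SageMaker, Vertex AI, Azure ML, Kubeflow or Airflow API: the step names, the shared context dictionary and the synthetic data are all assumptions made up for illustration. Each step reads what earlier steps produced and writes its own output, which is roughly what the managed tools orchestrate for you at scale.

```python
import random
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Step = Callable[[Context], None]


def ingest(ctx: Context) -> None:
    # Synthetic data: y = 2x + 1 plus small Gaussian noise (invented for this sketch).
    rng = random.Random(0)
    xs = [i / 10 for i in range(100)]
    ctx["data"] = [(x, 2 * x + 1 + rng.gauss(0, 0.1)) for x in xs]


def split(ctx: Context) -> None:
    # Shuffle with a fixed seed, then hold out 20% of the rows for testing.
    rows = list(ctx["data"])
    random.Random(1).shuffle(rows)
    cut = int(len(rows) * 0.8)
    ctx["train"], ctx["test"] = rows[:cut], rows[cut:]


def train(ctx: Context) -> None:
    # Closed-form least squares for a single feature.
    rows = ctx["train"]
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = cov / var
    ctx["model"] = (slope, mean_y - slope * mean_x)


def evaluate(ctx: Context) -> None:
    # Mean squared error on the held-out rows.
    slope, intercept = ctx["model"]
    errors = [(slope * x + intercept - y) ** 2 for x, y in ctx["test"]]
    ctx["mse"] = sum(errors) / len(errors)


def run_pipeline(steps: List[Tuple[str, Step]]) -> Context:
    # Run the steps in order, recording which ones completed.
    ctx: Context = {"log": []}
    for name, step in steps:
        step(ctx)
        ctx["log"].append(name)
    return ctx


STEPS = [("ingest", ingest), ("split", split), ("train", train), ("evaluate", evaluate)]

if __name__ == "__main__":
    result = run_pipeline(STEPS)
    print(result["log"], result["model"], result["mse"])
```

The hosted services add what this toy leaves out: each step runs in its own container, artifacts are stored between steps, and runs are tracked and retried, which is exactly what is worth inspecting in the console afterwards.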
4. Research & explore the web
Just surf the internet, maybe starting with https://github.com/visenger/awesome-mlops , and check out what others are up to, especially companies like Uber, Airbnb, Lyft, Shopify, Netflix, etc.
FBLearner Flow in particular: Facebook’s post from all the way back in 2016 is still as awesome as ever and is a brilliant introduction to design patterns for doing ML at a big company https://engineering.fb.com/2016/05/09/core-data/introducing-fblearner-flow-facebook-s-ai-backbone/ .
5. Andrew Ng
Do the Andrew Ng course at https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops . Just do it.
6. Talks and conferences
Develop a method to search the Internet and find people speaking about the topic. A lot of people speak about ML these days in different forums, events and conferences.
Stanford MLSys seminars https://www.youtube.com/playlist?list=PLSrTvUm384I9PV10koj_cqit9OfbJXEkq
DeepLearning AI channel https://www.youtube.com/watch?v=Ta14KpeZJok
AWS re:Invent 2021 keynote https://www.youtube.com/watch?v=ue9aumC7AAk
A podcast I found on InfoQ’s channel https://www.youtube.com/watch?v=aPbQMx7zuOM
The awesome-mlops index again https://github.com/visenger/awesome-mlops#talks-about-mlops
My own talk at AWS re:Invent from 2019 https://youtu.be/fgLlEnhFZQA?t=1200