Type4Py is an ML-based type auto-completion tool for Python. It assists developers in gradually adding type annotations to their codebases, so they can benefit from the advantages of static typing such as better code completion, early bug detection, program repair, and type inference.
Roadmap
Future work
VSCode Extension
The extension sends an opened Python source file to the server and receives a JSON response.
- Type slots are function parameters, return types, and variables, which are located based on their line and column numbers.
- Currently, type prediction can be triggered via the Command Palette or by enabling the AutoInfer setting.
Releasing
The development environment is used to test, debug, and profile Type4Py's server-side components before releasing new features/fixes into the production code.
- This way, users are not affected by new changes and features until they reach the production environment
- Whenever we train a new neural model, we test it against its evaluation metrics and run integration tests to ensure that it produces the expected predictions
Kubernetes
Deploy ML applications using K8s
- All you need to do is create a deployment and a service
- A deployment file specifies the container image to pull for your ML app, plus any required CLI arguments or environment variables
- The service exposes your ML application to the outside and also load-balances incoming traffic among the pods
- With a deployment file in place, deploying and scaling boils down to a few simple commands (see the sketch below for a programmatic equivalent)
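The post describes the usual route of writing a deployment file and applying it with kubectl. Purely for illustration, here is a rough sketch of creating an equivalent Deployment and Service with the official Kubernetes Python client; the app name, image path, ports, and environment variable are hypothetical placeholders.

```python
from kubernetes import client, config

config.load_kube_config()  # use the local kubeconfig (e.g., minikube or a remote cluster)

# Hypothetical names and image; replace them with your ML app's details.
APP_NAME = "ml-app"
IMAGE = "registry.example.com/ml-app:latest"

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name=APP_NAME),
    spec=client.V1DeploymentSpec(
        replicas=3,  # scale by changing the number of pod replicas
        selector=client.V1LabelSelector(match_labels={"app": APP_NAME}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": APP_NAME}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name=APP_NAME,
                        image=IMAGE,  # the container image to pull for your ML app
                        ports=[client.V1ContainerPort(container_port=5000)],
                        env=[client.V1EnvVar(name="MODEL_PATH", value="/models/model.onnx")],
                    )
                ]
            ),
        ),
    ),
)

# The Service exposes the pods and load-balances incoming traffic among them.
service = client.V1Service(
    api_version="v1",
    kind="Service",
    metadata=client.V1ObjectMeta(name=APP_NAME),
    spec=client.V1ServiceSpec(
        selector={"app": APP_NAME},
        ports=[client.V1ServicePort(port=80, target_port=5000)],
        type="LoadBalancer",
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
client.CoreV1Api().create_namespaced_service(namespace="default", body=service)
```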
Containerizing ML applications
Containerization is packaging an application with all its dependencies and required run-time libraries
- It allows you to run your application in isolation, i.e., a “container”, on different environments and operating systems
- Create a Dockerfile that starts from a base image and installs all the dependencies your application needs
- For deployment, there are generally two options
- Ship your containerized ML application with a trained model to the users directly
- Alternatively, you can purchase a domain for your application and make it accessible to the world
Dataset
The ManyTypes4Py dataset contains 5.2K Python projects and 4.2M type annotations.
Feature Extraction
We extract three kinds of type hints: identifiers, code context, and visible type hints (VTHs)
- Identifiers: function name, parameters, and variables
- Code context: the statements in which parameters and variables are used
- VTHs: collected via a deep recursive analysis of the import statements in a file and its transitive dependencies in our dataset
- We build a VTH vocabulary to give the model a hint about the expected type (a toy example of the three hint kinds follows below)
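To make these three kinds of type hints concrete, here is a toy sketch of what might be extracted for a small function. The dictionary layout is purely illustrative and is not the actual output format of our pipeline.

```python
# Toy function from which type hints could be extracted (illustrative only).
def parse_port(value, default):
    port = int(value) if value else default
    return port

# A rough, hypothetical view of the extracted hints; the real pipeline
# (LibSA4Py) produces a richer, tokenized representation.
type_hints = {
    # Identifiers: function name, parameter names, and variable names
    "identifiers": ["parse_port", "value", "default", "port"],
    # Code context: the statements in which parameters/variables are used
    "code_context": ["port = int(value) if value else default", "return port"],
    # Visible type hints: types visible in the file (e.g., the builtin int)
    "visible_type_hints": ["int"],
}
```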
Implementation
The server-side components are all written in Python
- To extract type hints, we first extract Abstract Syntax Trees (ASTs) and perform light-weight static analysis using our LibSA4Py package.
- NLP tasks are performed using NLTK
- For the Type4Py model, we use bidirectional LSTMs in PyTorch to implement the two RNNs. To minimize the Triplet loss, we employ the Adam optimizer (a simplified sketch follows below)
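As an illustration, below is a heavily simplified PyTorch sketch of this setup: two bidirectional LSTMs, a fused vector, a projection into the type space, the Triplet loss, and the Adam optimizer. Layer sizes, vocabulary handling, and the way triplets are formed are assumptions, not the actual Type4Py implementation.

```python
import torch
import torch.nn as nn

class Type4PyLikeModel(nn.Module):
    """Simplified sketch: two bidirectional LSTMs whose outputs are fused and projected."""

    def __init__(self, vocab_size=10000, embed_dim=100, hidden_dim=256, type_space_dim=4096):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One RNN for identifiers, one for code context
        self.id_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.ctx_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Fully-connected layer over the concatenated vector, then a final
        # linear layer mapping into the high-dimensional type space
        self.fc = nn.Linear(4 * hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, type_space_dim)

    def forward(self, id_tokens, ctx_tokens):
        _, (id_h, _) = self.id_lstm(self.embed(id_tokens))
        _, (ctx_h, _) = self.ctx_lstm(self.embed(ctx_tokens))
        # Concatenate the final forward/backward hidden states of both RNNs
        fused = torch.cat([id_h[-2], id_h[-1], ctx_h[-2], ctx_h[-1]], dim=1)
        return self.out(torch.relu(self.fc(fused)))

model = Type4PyLikeModel()
criterion = nn.TripletMarginLoss(margin=2.0)               # Triplet loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam optimizer

# One hypothetical training step on random token IDs (anchor, positive, negative)
def random_batch():
    return torch.randint(0, 10000, (8, 20)), torch.randint(0, 10000, (8, 50))

anchor = model(*random_batch())
positive = model(*random_batch())
negative = model(*random_batch())

optimizer.zero_grad()
loss = criterion(anchor, positive, negative)
loss.backward()
optimizer.step()
```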
Overview
There is a VSCode extension on the client side (developers), while the Type4Py model and its pipeline are deployed on our servers.
- Simply put, the extension sends a Python file to the server and receives a JSON response containing type predictions for the given file (a client-side sketch follows below).
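For illustration, a client could look roughly like the sketch below; the endpoint URL is a placeholder, and the real server's API and response schema may differ.

```python
import requests

# Hypothetical endpoint; the real server URL and parameters may differ.
PREDICT_URL = "https://example.com/api/predict"

def request_type_predictions(path: str) -> dict:
    """Send a Python source file to the server and return its JSON predictions."""
    with open(path) as f:
        source = f.read()
    response = requests.post(PREDICT_URL, data=source)
    response.raise_for_status()
    return response.json()

predictions = request_type_predictions("example.py")
print(predictions)
```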
Monitoring ML applications
Use Prometheus to monitor your application, for example its request rate, latency, and error count.
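As an example, a Python service can expose such metrics with the prometheus_client library so that a Prometheus server can scrape them; the metric names and the model stub below are illustrative only.

```python
from prometheus_client import Counter, Histogram, start_http_server

# Illustrative metric names; adapt them to your application.
PREDICTIONS = Counter("predictions_total", "Number of prediction requests served")
LATENCY = Histogram("prediction_latency_seconds", "Time spent handling a prediction")

def run_model(sample):
    # Placeholder for your actual model-querying code.
    return {"prediction": 0}

def handle_prediction(sample):
    with LATENCY.time():  # record how long the prediction takes
        result = run_model(sample)
    PREDICTIONS.inc()     # count served predictions
    return result

if __name__ == "__main__":
    start_http_server(8000)  # expose metrics at http://localhost:8000/metrics
    handle_prediction({"input": "example"})
```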
Wrapping Up
Hopefully, this post was useful and you can now deploy your first ML model somewhere so that people can try it.
- Your model does not have to be a giant deep learning model. It can be a small classic model that solves an important or interesting problem.
Acknowledgments
The Type4Py model, its pipeline, and VSCode extension are all designed and developed at the Software Analytics Lab of SERG, Delft University of Technology.
Deploying an ML model
High-level steps to deploy
- Export your pre-trained model to the ONNX format
- Create a small REST API application to query your model.
- Containerize your ML application using Docker
- Deploy and scale it with Kubernetes
- Monitor your application
Deployment
To deploy the Type4Py model to the production environment, we convert the pre-trained PyTorch model to an ONNX model, which allows us to query the model on both GPUs and CPUs with very fast inference speed and lower VRAM consumption (a conversion sketch follows below).
- Thanks to Annoy, Type Clusters are memory-mapped from disk rather than fully loaded into RAM, which reduces memory consumption.
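As a rough sketch, such a conversion can be done with torch.onnx.export. The model below is a generic stand-in, not the actual Type4Py checkpoint, and the file, input, and output names are assumptions.

```python
import torch

# Hypothetical stand-in for the pre-trained PyTorch model and a sample input.
model = torch.nn.Linear(128, 1024).eval()
dummy_input = torch.randn(1, 128)

# Export to ONNX; dynamic axes let the exported model accept any batch size.
torch.onnx.export(
    model,
    dummy_input,
    "type4py_like_model.onnx",
    input_names=["features"],
    output_names=["type_embedding"],
    dynamic_axes={"features": {0: "batch"}, "type_embedding": {0: "batch"}},
)
```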
Model Architecture & Training
The Type4Py model consists of two RNNs with LSTM units, one for identifiers and one for code context. Their outputs are concatenated into a single vector and passed through a fully-connected linear layer; a final linear layer maps the learned representation into a high-dimensional feature space, called Type Clusters.
- To infer the type of a given query, we perform a k-nearest neighbor (KNN) search in the Type Clusters to suggest a list of similar types (a sketch with Annoy follows below)
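Below is a minimal sketch of such a search with Annoy, which also shows the memory-mapped loading mentioned in the Deployment section. The dimensionality, distance metric, and type labels are assumptions for illustration.

```python
import random
from annoy import AnnoyIndex

DIM = 1024  # assumed dimensionality of the type-cluster space

# Build a toy index of type embeddings; in practice these come from the model.
index = AnnoyIndex(DIM, "euclidean")
type_labels = ["int", "str", "List[int]", "Dict[str, int]", "Optional[str]"]
for i, _ in enumerate(type_labels):
    index.add_item(i, [random.random() for _ in range(DIM)])
index.build(10)  # 10 trees
index.save("type_clusters.ann")

# Load the index memory-mapped from disk (this is what keeps RAM usage low).
clusters = AnnoyIndex(DIM, "euclidean")
clusters.load("type_clusters.ann")

# Query: embed the type slot with the model, then fetch the k nearest types.
query_embedding = [random.random() for _ in range(DIM)]
neighbors = clusters.get_nns_by_vector(query_embedding, 5)
print([type_labels[i] for i in neighbors])
```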
Exporting ML models
Oftentimes, ML frameworks are optimized to speed up model training, not prediction/inference
- It is highly recommended to export or convert your trained model to the ONNX format
- ONNX is an open standard that greatly accelerates model inference and enables querying models on various systems such as cell phones, embedded systems, and workstations (an inference sketch with ONNX Runtime follows below)
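Once exported, the model can be queried with ONNX Runtime. A minimal sketch, assuming the file and tensor names from the conversion sketch above:

```python
import numpy as np
import onnxruntime as ort

# Load the exported model; the execution provider decides whether it runs on CPU or GPU.
session = ort.InferenceSession("type4py_like_model.onnx", providers=["CPUExecutionProvider"])

# Query the model with a single sample; names must match those used at export time.
features = np.random.rand(1, 128).astype(np.float32)
outputs = session.run(["type_embedding"], {"features": features})
print(outputs[0].shape)
```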
Creating a REST API to query a model
You need to create a tiny REST API with a prediction endpoint.
- Name the prediction endpoint “/predict”.
- This endpoint can accept a raw input/sample, pre-process it, extract features, query the trained model, and finally return predictions as a response (a minimal sketch follows below).
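A minimal sketch of such an endpoint using Flask; the pre-processing and model-querying functions are placeholders for your own code.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def extract_features(sample):
    # Placeholder: pre-process the raw input and extract features here.
    return sample

def query_model(features):
    # Placeholder: run your exported/loaded model here.
    return {"prediction": "int", "confidence": 0.9}

@app.route("/predict", methods=["POST"])
def predict():
    sample = request.get_json()          # raw input/sample from the client
    features = extract_features(sample)  # pre-processing + feature extraction
    prediction = query_model(features)   # query the trained model
    return jsonify(prediction)           # return predictions as a response

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```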
Next steps
To learn more, you can take Andrew Ng's course on ML engineering for production.