In this brief post, I discuss some of the trends in ML and list some notable recent works. The way we train SotA models has changed slightly from a few years ago in order to optimize performance: we would first build a massive (often multimodal) dataset crawled from the Web and model-parallelize... Continue Reading →
Machine Learning Learning Roadmap
In this brief post, I describe a coarse learning roadmap for ML, limited to what you can learn from lectures. Once you are beyond this level, you may want to move on to the sequel to this post, Current Landscape of Machine Learning, which describes which papers and external sources you... Continue Reading →
GPT-J-6B: 6B JAX-Based Transformer
Summary: We have released GPT-J-6B, a 6B-parameter JAX-based (Mesh) Transformer LM (GitHub). GPT-J-6B performs nearly on par with the 6.7B GPT-3 (or Curie) on various zero-shot downstream tasks. You can try it out with this Colab notebook or the free web demo. This library also serves as an example of model parallelism with xmap on JAX. Below, we will refer to GPT-J-6B by... Continue Reading →
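Since the excerpt highlights model parallelism with xmap, here is a minimal sketch of the idea, not GPT-J's actual code: a Megatron-style feed-forward layer whose hidden dimension is sharded across devices along a named axis, with the partial outputs summed by a collective. The function and names here (ff, w1, w2, the "shard" logical axis, the "mp" mesh axis) are illustrative assumptions, and the exact import path depends on the JAX version.

```python
# A minimal, illustrative sketch of xmap-style model parallelism
# (not GPT-J's actual code). Requires a JAX version that still ships
# jax.experimental.maps; import paths changed across releases.
import jax
import jax.numpy as jnp
import numpy as np
from jax.experimental.maps import mesh, xmap

def ff(x, w1, w2):
    # Each device holds one slice of the hidden dimension; the partial
    # outputs are summed across shards with a psum over the named axis.
    h = jax.nn.relu(x @ w1)               # per-shard hidden activations
    return jax.lax.psum(h @ w2, "shard")  # sum partial outputs

n = jax.device_count()
x = jnp.ones(8)            # unsharded input, d_model = 8
w1 = jnp.ones((n, 8, 4))   # leading axis = shard axis (d_ff split n ways)
w2 = jnp.ones((n, 4, 8))

f = xmap(
    ff,
    in_axes=([...], ["shard", ...], ["shard", ...]),
    out_axes=[...],                   # psum removes the named axis
    axis_resources={"shard": "mp"},   # map logical axis onto the mesh axis
)

with mesh(np.asarray(jax.devices()), ("mp",)):
    y = f(x, w1, w2)  # y has shape (8,)
```

Note that xmap was experimental and has since been deprecated in favor of newer JAX sharding APIs, so treat this purely as a sketch of the named-axis style of model parallelism the post refers to.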
State-of-the-Art Image Generative Models
I have aggregated some of the SotA image generative models released recently, with short summaries, visualizations, and comments. I summarize the overall development and speculate on future trends. Many of the statements and results here apply just as well to other non-textual modalities, such as audio and video.
Some Notable Recent ML Papers and Future Trends
I have aggregated some of the notable papers released recently, especially ICLR 2021 submissions, with concise summaries, visualizations, and my comments. I summarize the developments in each field and speculate on future trends.