AI Canon

Jun 07, 2023

Research in artificial intelligence is increasing at an exponential rate. It's difficult for AI experts to keep up with everything new being published, and even harder for beginners to know where to start.

So, in this post, we’re sharing a curated list of resources we’ve relied on to get smarter about modern AI. We call it the "AI Canon" because these papers, blog posts, courses, and guides have had an outsized impact on the field over the past several years.

We start with a gentle introduction to transformer and latent diffusion models, which are fueling the current AI wave. Next, we go deep on technical learning resources; practical guides to building with large language models (LLMs); and analysis of the AI market. Finally, we include a reference list of landmark research results, starting with "Attention is All You Need" — the 2017 paper by Google that introduced the world to transformer models and ushered in the age of generative AI.
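
Before diving in, it can help to see how small the core of a transformer really is. Below is a minimal numpy sketch of scaled dot-product attention, the central operation of "Attention is All You Need"; it is illustrative only and omits the multi-head structure, masking, and learned projections of a real implementation.

```python
# Minimal numpy sketch of scaled dot-product attention, the core operation of
# the transformer ("Attention is All You Need", 2017). Illustrative only:
# real implementations add multiple heads, masking, learned Q/K/V projections,
# and heavy optimization.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted average of values

# Toy self-attention: 4 tokens with 8-dimensional embeddings (Q = K = V).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)
```

The softmax turns query-key similarities into mixture weights, so each output row is a data-dependent average of the value rows; much of the material below explains what gets built around that step.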

These articles require no specialized background and can help you get up to speed quickly on the most important parts of the modern AI wave.

- Software 2.0
- State of GPT
- What is ChatGPT doing … and why does it work?
- Transformers, explained
- How Stable Diffusion works

These resources provide a base understanding of fundamental ideas in machine learning and AI, from the basics of deep learning to university-level courses from AI experts.

- Deep learning in a nutshell: core concepts
- Practical deep learning for coders
- Word2vec explained
- Yes you should understand backprop
- Stanford CS229
- Stanford CS224N

There are countless resources — some better than others — attempting to explain how LLMs work. Here are some of our favorites, targeting a wide range of readers/viewers.

- The illustrated transformer
- The annotated transformer
- Let's build GPT: from scratch, in code, spelled out
- The illustrated Stable Diffusion
- RLHF: Reinforcement Learning from Human Feedback
- Reinforcement learning from human feedback
- Stanford CS25
- Stanford CS324
- Predictive learning, NIPS 2016
- AI for full-self driving at Tesla
- The scaling hypothesis
- Chinchilla's wild implications
- A survey of large language models
- Sparks of artificial general intelligence: Early experiments with GPT-4
- The AI revolution: How Auto-GPT unleashes a new era of automation and creativity
- The Waluigi Effect

A new application stack is emerging with LLMs at the core. While there isn't a lot of formal education available on this topic yet, we pulled out some of the most useful resources we’ve found.

- Build a GitHub support bot with GPT3, LangChain, and Python
- Building LLM applications for production
- Prompt Engineering Guide
- Prompt injection: What's the worst that can happen?
- OpenAI cookbook
- Pinecone learning center
- LangChain docs
- LLM Bootcamp
- Hugging Face Transformers
- Chatbot Arena
- Open LLM Leaderboard
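
The guides above differ in tooling, but most orbit one core pattern: put some context into a prompt, call a hosted model, and post-process the response. Here is a minimal sketch of that pattern, assuming the openai Python package and its ChatCompletion API (current as of this writing) plus an OPENAI_API_KEY environment variable; the model name, prompt template, and helper function are illustrative assumptions, not recommendations.

```python
# Minimal sketch of the prompt-in, completion-out pattern at the heart of the
# emerging LLM app stack. Assumes the `openai` Python package and its
# ChatCompletion API (current as of this writing) plus an OPENAI_API_KEY
# environment variable; the model name, prompt template, and helper are
# illustrative assumptions, not recommendations.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def answer_with_context(question: str, context: str) -> str:
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the answer as deterministic as the API allows
    )
    return response.choices[0].message.content

print(answer_with_context(
    "Which paper introduced the transformer?",
    "The transformer was introduced in 'Attention is All You Need' (2017).",
))
```

In a real application, the context string would come from a retrieval layer such as a vector database, which is exactly the part of the stack the Pinecone and LangChain resources above cover.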

We’ve all marveled at what generative AI can produce, but there are still a lot of questions about what it all means. Which products and companies will survive and thrive? What happens to artists? How should companies use it? How will it affect jobs and society at large? Here are some attempts at answering these questions.

- Who owns the generative AI platform?
- Navigating the high cost of AI compute
- Art isn't dead, it's just machine-generated
- The generative AI revolution in games
- For B2B generative AI apps, is less more?
- Financial services will embrace generative AI faster than you think
- Generative AI: The next consumer platform
- To make a real difference in health care, AI will need to learn like we do
- The new industrial revolution: Bio x AI
- On the opportunities and risks of foundation models
- State of AI Report
- GPTs are GPTs: An early look at the labor market impact potential of large language models
- Deep medicine: How artificial intelligence can make healthcare human again

Most of the amazing AI products we see today are the result of no-less-amazing research, carried out by experts inside large companies and at leading universities. Lately, we’ve also seen impressive work from individuals and the open source community taking popular projects in new directions, for example by creating automated agents or porting models onto smaller hardware footprints.

Here's a collection of many of these papers and projects, for folks who really want to dive deep into generative AI. (For research papers and projects, we’ve also included links to the accompanying blog posts or websites, where available, which tend to explain things at a higher level. And we’ve included original publication years so you can track foundational research over time.)

Large language models

New models

- Attention is all you need
- BERT: Pre-training of deep bidirectional transformers for language understanding
- Improving language understanding by generative pre-training
- Language models are few-shot learners
- Training language models to follow instructions with human feedback
- LaMDA: Language models for dialog applications
- PaLM: Scaling language modeling with Pathways
- OPT: Open Pre-trained Transformer language models
- Training compute-optimal large language models
- GPT-4 technical report
- LLaMA: Open and efficient foundation language models
- Alpaca: A strong, replicable instruction-following model

Model improvements (e.g. fine-tuning, retrieval, attention)

- Deep reinforcement learning from human preferences
- Retrieval-augmented generation for knowledge-intensive NLP tasks
- Improving language models by retrieving from trillions of tokens
- LoRA: Low-rank adaptation of large language models (a toy sketch of the idea follows these lists)
- Constitutional AI (2022)
- FlashAttention: Fast and memory-efficient exact attention with IO-awareness
- Hungry hungry hippos: Towards language modeling with state space models

Image generation models

- Learning transferable visual models from natural language supervision
- Zero-shot text-to-image generation
- High-resolution image synthesis with latent diffusion models
- Photorealistic text-to-image diffusion models with deep language understanding
- DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation
- Adding conditional control to text-to-image diffusion models

Agents

- A path towards autonomous machine intelligence
- ReAct: Synergizing reasoning and acting in language models
- Generative agents: Interactive simulacra of human behavior
- Reflexion: An autonomous agent with dynamic memory and self-reflection
- Toolformer: Language models can teach themselves to use tools
- Auto-GPT: An autonomous GPT-4 experiment
- BabyAGI

Other data modalities

Code generation

- Evaluating large language models trained on code
- Competition-level code generation with AlphaCode
- CodeGen: An open large language model for code with multi-turn program synthesis

Video generation

- Make-A-Video: Text-to-video generation without text-video data
- Imagen Video: High definition video generation with diffusion models

Human biology and medical data

- Strategies for pre-training graph neural networks
- Improved protein structure prediction using potentials from deep learning
- Large language models encode clinical knowledge

Audio generation

- Jukebox: A generative model for music
- AudioLM: A language modeling approach to audio generation
- MusicLM: Generating music from text

Multi-dimensional image generation

- NeRF: Representing scenes as neural radiance fields for view synthesis
- DreamFusion: Text-to-3D using 2D diffusion
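
As flagged in the model-improvements list above, here is a toy numpy sketch of the idea behind LoRA: keep the pretrained weight matrix frozen and learn only a low-rank correction to it. Shapes, initialization, and the training loop are simplified away; the real method learns the two factors by gradient descent inside a transformer's projection layers.

```python
# Toy numpy sketch of the idea behind LoRA (low-rank adaptation): keep the
# pretrained weight W frozen and learn only a small low-rank correction B @ A.
# Initialization follows the paper's recipe (A random, B zero, so the adapter
# starts as a no-op); the gradient-descent training of A and B is omitted.
import numpy as np

d, k, r = 512, 512, 8                 # W is d x k; adapter rank r << min(d, k)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, k))           # pretrained weight, frozen
A = rng.normal(size=(r, k)) * 0.01    # trainable factor, small random init
B = np.zeros((d, r))                  # trainable factor, zero init

x = rng.normal(size=(k,))
h = W @ x + B @ (A @ x)               # forward pass: frozen path + correction

# Trainable parameters: full fine-tuning vs. the low-rank adapter.
print(W.size, A.size + B.size)        # 262144 vs. 8192
```

The appeal is the parameter count: the adapter here trains roughly 3% as many weights as full fine-tuning, and the correction can be merged back into W after training.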

Special thanks to Jack Soslow, Jay Rughani, Marco Mascorro, Martin Casado, Rajko Radovanovic, and Vijay Pande for their contributions to this piece, and to the entire a16z team for an always informative discussion about the latest in AI. And thanks to Sonal Chokshi and the crypto team for building a long series of canons at the firm.

* * *

The views expressed here are those of the individual AH Capital Management, L.L.C. ("a16z") personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. In addition, this content may include third-party advertisements; a16z has not reviewed such advertisements and does not endorse any advertising content contained therein.

This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments for which the issuer has not provided permission for a16z to disclose publicly as well as unannounced investments in publicly traded digital assets) is available at https://a16z.com/investments/.

Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.
