History of AI
Why start with history?
To understand why ChatGPT or Claude only became possible in the 2020s, you have to know what came before. AI was not born in 2022: it has existed since the 1950s and has gone through three big waves.
Every wave added capability rather than replacing the previous one. Rule-based systems, classical ML, and modern LLMs all coexist in production today.
A 70-year timeline
- 1950: Turing test; symbolic AI
- 1956: Dartmouth workshop coins the term "AI"
- 1980s: Expert systems boom; backpropagation popularised
- 1997: Deep Blue beats Kasparov
- 2006: Hinton revives deep networks
- 2012: AlexNet wins ImageNet (CNN)
- 2017: "Attention Is All You Need" introduces the Transformer
- 2020: GPT-3
- 2022: ChatGPT goes mainstream
- 2024: Multimodal LLMs and AI agents
The three big waves of AI
- Wave 1: Symbolic AI (1950–1980)
- Wave 2: Classical ML (1980–2010)
- Wave 3: Deep Learning + LLMs (2012–today)
Waves 1 and 2 did not disappear: both coexist with Wave 3.
Wave 1 — Symbolic AI (1950–1980): “if… then…”
The earliest AI programs were rule-based systems in which every rule was written by hand.
Example: a 1970s-style medical expert system might contain around 500 rules such as “if the patient has a fever AND a dry cough THEN suggest the flu”.
It works well… as long as the rule exists. The moment you step outside the planned domain, the system is blind. This is the limitation that the next wave fixes.
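To make this concrete, here is a minimal sketch of a rule-based system in Python. The rule table, the symptom names and the `diagnose` function are all invented for illustration; they are not taken from any real expert system.

```python
# Each rule pairs a set of required symptoms with a suggested conclusion.
# Both rules are illustrative assumptions, not real medical knowledge.
RULES = [
    ({"fever", "dry cough"}, "flu"),
    ({"sneezing", "itchy eyes"}, "allergy"),
]


def diagnose(symptoms):
    """Return the conclusion of the first rule whose conditions all hold, else None."""
    observed = set(symptoms)
    for conditions, conclusion in RULES:
        # A rule fires only if every one of its conditions was observed.
        if conditions <= observed:
            return conclusion
    return None  # no rule fires: the system has nothing to say


print(diagnose(["fever", "dry cough", "headache"]))  # flu
print(diagnose(["rash", "joint pain"]))              # None
```

The second call returns `None` because no rule mentions those symptoms: the system knows only what someone wrote down.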
Wave 2 — Classical Machine Learning (1980–2010): “learn from examples”
Instead of writing rules, we show examples to an algorithm, and it figures out the rules on its own.
Example: we give 10,000 emails labelled “spam” or “not spam”, and the algorithm learns to spot spam.
Key methods: linear regression, decision trees, support vector machines (SVM), k-nearest neighbours (k-NN), random forests. They are still used today for many tabular-data problems.
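As a rough sketch of learning from examples, the snippet below trains a tiny spam filter in plain Python. It uses naive Bayes, a classical method that is not in the list above but is a common choice for text. The six training emails, the label strings and the `train` and `predict` helpers are made up for illustration; a real filter would need thousands of examples.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count word frequencies per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(model, text):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts = model
    vocab = set().union(*word_counts.values())
    total_docs = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Start from the prior: how common this label was in training.
        score = math.log(label_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training data: nobody wrote a rule saying "free" means spam.
EXAMPLES = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting moved to monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
    ("notes from the team meeting", "not spam"),
]

model = train(EXAMPLES)
print(predict(model, "free prize inside"))       # spam
print(predict(model, "team meeting on monday"))  # not spam
```

Unlike the hand-written rules of Wave 1, nothing here says which words signal spam; the word statistics come entirely from the labelled examples, so changing the examples changes the behaviour.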
Wave 3 — Deep Learning and LLMs (2012–today)
In 2012, a team from the University of Toronto (Krizhevsky, Sutskever, Hinton) wins the ImageNet competition with AlexNet, a CNN (convolutional neural network). Its top-5 error rate of about 15% beats the runner-up's roughly 26%, and researchers realise that deep networks + lots of data + GPUs define the new state of the art.
What followed:
- 2014–2017: image recognition and machine translation approach human-level scores on some benchmarks.
- 2017: researchers at Google publish "Attention Is All You Need", which introduces the Transformer architecture.
- 2020: OpenAI releases GPT-3, a 175-billion-parameter model showing that a single large language model can handle many tasks from just a few examples in the prompt.
- 2022: ChatGPT ships; AI hits the mainstream.
- 2024–2026: multimodal LLMs (text + image + audio), code agents, video generation, agentic systems.
Key takeaways
- AI is 70 years old, not 4.
- Three big waves: rules, classical ML, deep learning / LLMs.
- The modern turning point is 2012 (CNN on ImageNet) followed by 2017 (Transformer).
- No wave replaced the previous one — rules, classical ML and LLMs coexist in real-world systems.
Next: From rules to data — what changed in the way we build software when ML arrived.