ABSTRACT

Generative artificial intelligence (GAI) has recently achieved significant success, enabling anyone to create text, images, videos and even computer code while providing insights that might not be possible with traditional tools. To stimulate future research, this work provides a brief summary of the ongoing and historical developments in GAI over the past 70 years. The achievements are grouped into four categories: (i) rule-based generative systems that follow specialized rules and instructions, (ii) model-based generative algorithms that produce new content based on statistical or graphical models, (iii) deep generative methodologies that utilize deep neural networks to learn how to generate new content from data and (iv) foundation models that are trained on extensive datasets and are capable of performing a variety of generative tasks. This paper also reviews successful generative applications and identifies open challenges posed by remaining issues. In addition, this paper describes potential research directions aimed at better utilizing, understanding and harnessing GAI technologies.

INTRODUCTION

Generative artificial intelligence (GAI) refers to a group of AI algorithms and models that are capable of producing new content, including texts, images, videos and problem-solving strategies, with human-like creativity and adaptability. The past few years have witnessed unprecedented advancements in GAI. Notably, the AI system ChatGPT [1] can communicate with humans in over 80 languages, and it can be used to perform almost any task for which text responses are appropriate. The capabilities of ChatGPT facilitate its use for generating visual, audio and even multimodal content. This success stems from the development of GAI over half a century, with representative milestones including the rise of deep learning, transformer architectures and foundation models.

This work presents a systematic review of GAI from a historical perspective. The scope of the work includes modern GAI, which is realized through programmable computers. We review the history from the origin to the present, highlighting milestone events and organizing them into four stages.

  1. Rule-based generative systems. Computerized methods for autonomous generation emerged in the 1950s, followed by computer programs that are capable of generating data. These programs typically generate data by following the rules designed by human experts. During this period, expert systems achieved early success in some specific tasks.

  2. Model-based generative algorithms. Researchers designed generative algorithms based on statistical or physical models. Hence, GAI came to include studies in machine learning, neural networks, computer graphics, computer vision, etc. Then, various generative applications built upon these studies were introduced. Among these examples, technologies such as computer animation generation became reliable for practical use, and have started to replace human efforts in content generation.

  3. Deep generative methodologies. Benefiting from the growth in computational power and data resources, deep neural networks [2,3] have demonstrated superior power in content generation [4]. Deep generative models, including autoregressive-based [5] and diffusion-based [6] models, were then introduced and have served as the basis for numerous practical applications to the present day. Moreover, computer graphics researchers have developed deep learning–based approaches [7] that show improved capability and scalability in open environments.

  4. Foundation models. The advent of generative pretrained transformers (GPTs) [8–10], which are a prominent family of foundation models, represents a significant revolution in GAI. Such models leverage deep learning techniques, but they are characterized by their large scale in terms of model size and training data. The strategy of scaling up yields unprecedented advantages, including high-quality content generation, natural interactions and versatility across tasks. Consequently, foundation models have become the driving force of content generation across various applications.

The rise of GAI has revolutionized the production of content and services, creating multimedia data as well as other content types such as plans, code and proteins. The number of industries adopting GAI technologies has been increasing rapidly, especially since foundation models became popular. Today, traditional sectors such as manufacturing, developing industries such as autonomous driving [11] and emerging fields such as molecular design [12] have all seen successful implementations based on generative approaches.

A representative timeline is shown in Fig. 1, tracing the development trajectory of GAI methods and applications. In the remaining parts of this paper, we detail representative approaches, discuss the strengths and limitations of different kinds of generative technologies, and introduce successful generative applications in various fields. In addition, we summarize the open challenges and possible future directions.

Figure 1. Timeline of the development of GAI methods and applications.

RULE-BASED GENERATIVE SYSTEMS

Studies of automatic data generation can be traced back to the 1950s, when symbolic AI emerged. During that period, as shown in Fig. 2a, researchers designed rules based on their expertise and implemented programs to execute generative tasks according to such rules. Generally, such a program [13] consisted of two primary components, namely, a generation engine and an interpreter.

Figure 2. Evolution of design principles in GAI.

The function of the generation engine is to generate data through formulaic operations. It is structured around a knowledge base that includes rules and facts. Human experts design different types of rules, formulating each with distinct antecedent and consequent components [14], and embed them into the generation engine as coded symbolic descriptions. Notably, each set of rules is effective only for a specific task; subsequent research [15] therefore developed rules for further tasks, such as dialogue and translation. The facts in the generation engine, in turn, provide factual information that conveys assertions about propositions [13], properties [14], relations [16] and so on. When an input signal containing factual information is received, the generation engine traverses all the rules. Once the engine has identified a rule whose antecedent matches the current facts, it takes the actions specified by that rule and generates new facts. This loop continues until a rule that concludes the generation process is satisfied.
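
As an illustration, the following is a minimal sketch of such a forward-chaining generation engine; the rules, facts and stopping condition are invented for this example and are not drawn from any historical system.

```python
# A minimal forward-chaining rule engine: facts are strings, and each
# rule maps a matched antecedent fact to a new consequent fact.
# All rule and fact names here are illustrative.

rules = [
    ("user_greets", "respond_with_greeting"),
    ("respond_with_greeting", "ask_about_topic"),
    ("user_mentions_family", "ask_about_family"),
]

facts = {"user_greets"}          # the initial factual input
stop_fact = "ask_about_topic"    # satisfying this concludes generation

while stop_fact not in facts:
    fired = False
    for antecedent, consequent in rules:
        if antecedent in facts and consequent not in facts:
            facts.add(consequent)   # take the action: generate a new fact
            fired = True
    if not fired:                   # no rule matches; generation stalls
        break

print(facts)
```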

The interpreter in the generative programs ensures that humans can understand the reasons behind the operations made by the generation engine. To achieve this goal, the interpreter translates the rules and all possible actions into explanations in a human-readable language [17]. The translation involves mapping the logical structure of the rules to descriptions that can be read by humans, ensuring that users can easily understand the reasoning behind the system’s decisions. Notably, the interpreter is also rule based, but these special rules are not directly involved in the generation process. Hence, the rule-based generative program is a ‘white box’. The interpreter can always provide straightforward explanations for the generated results even if the data generation process is complex. In addition, debugging and modification of generative programs are also interpretable.

Expert systems, designed based on user-defined rules, were widely applied in generative tasks that required specialized knowledge from the 1950s to the 1990s. Successful applications included but were not limited to chatbots, machine translation systems and speech synthesis systems. We examine each of these representative applications as follows.

One of the first AI chatbots, ELIZA, was introduced in 1966. It acts as a psychotherapist responding to patients. ELIZA processes text inputs via pattern matching and then selects predesigned responses based on the rules. The success of this pioneering chatbot was rooted in the limited scope of discussion topics, where rules are quite effective for simulating human conversations. Subsequent studies [18] designed chatbots for additional roles, such as that of a schizophrenia patient. However, these rule-based chatbots had a limited ability to understand context and were applicable only to specific tasks.

A machine translation system was first proposed in the late 1950s. It contained detailed linguistic and grammatical rules, as well as a fact base composed of linguistic knowledge. Subsequently, translation systems built on rules contributed by other computer scientists and computational linguists were proposed. For instance, SYSTRAN [19], which was developed in 1968, served as a translation tool for web browsers until 2007.

The introduction of speech synthesis systems can be attributed to the system proposed by Fant et al. [20] in the 1960s. This system used linguistic and phonetic rules to model the characteristics of speech and to connect speech segments. The synthesized speech fragments had a noticeable mechanical tone and were not fluent, differing significantly from natural human speech. Nevertheless, they were clear and easy to understand, thus meeting the requirements of certain practical applications.

Although rule-based generative programs have achieved notable applications, they always face an inherent challenge: scenarios outside predefined rules. For real-world applications, manually designed rules cannot consider all possible situations; thus, generative programs inevitably encounter situations beyond their capabilities. Moreover, in highly complex scenarios, the number of generative rules increases substantially, making the design and update processes prohibitively expensive.

MODEL-BASED GENERATIVE ALGORITHMS

To overcome the abovementioned inherent issues of the rule-based approach, researchers have explored generative algorithms based on models grounded in certain principles. The concept behind this design, shown in Fig. 2b, continues to be the de facto standard in GAI. Below, we review the algorithms based on statistical machine learning and computer graphics, owing to their significant contributions to GAI.

Statistical machine learning models

Statistical machine learning aims to design algorithms that can learn from data how to complete tasks, instead of following explicit rules. Generally, these approaches can be categorized into discriminative and generative models [21]. The former focus on learning to make predictions and decisions from data. In contrast, generative models aim to model the data distribution and then synthesize data through inference or sampling. Since the 1960s, several research approaches to data generation have emerged for generative modeling or approximate inference.

Generative modeling methods capture the characteristics of data distributions to construct statistical models explicitly or implicitly. The most typical explicit approach entails probabilistic graphical models [22]. These models build graphs in which the nodes are random variables that describe the data and the edges represent probabilistic relationships between variables. Building on this concept, generating data can be interpreted as inferring the unknown nodes in the graph. These models are typically optimized with likelihood maximization algorithms, most notably the expectation–maximization algorithm [23]. Moreover, they typically postulate the Markov property [24], which states that the future state of the data depends solely on its current state. Hidden Markov models [25] introduce latent variables when generating data; that is, the observable variables depend on the latent variables. Following this approach, Harris et al. [26] constructed probabilistic graphical models in which each node is conditionally independent of its non-neighbors given its neighboring nodes. This type of approach significantly reduces the complexity of modeling, making the data generation process computationally feasible. Subsequent studies [27] incorporated latent nodes into the graph model and successfully generated sequential data. Li et al. [28] introduced Markov networks that enforce bidirectional and symmetric edges defined by potential functions. Such networks are widely utilized for generating high-dimensional data, such as images. Friedman et al. [29] developed Bayesian networks that use directed edges to represent dependencies between nodes, which is effective for generating new content based on partially known data.
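
To make the hidden Markov model concrete, the following minimal sketch generates a sequence in which the latent state evolves according to a transition matrix and each observation depends only on the current latent state; all probabilities here are illustrative.

```python
import numpy as np

# Sampling from a two-state hidden Markov model with binary observations.
A = np.array([[0.9, 0.1],      # latent state transition probabilities
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],      # emission probabilities for each state
              [0.1, 0.9]])
pi = np.array([0.5, 0.5])      # initial state distribution

rng = np.random.default_rng(0)
state = rng.choice(2, p=pi)
observations = []
for _ in range(20):
    observations.append(int(rng.choice(2, p=B[state])))  # emit from current state
    state = rng.choice(2, p=A[state])                    # Markov transition
print(observations)
```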

Autoregressive models [5] constitute another type of explicit modeling approach, particularly suited for data consisting of sequential elements. Elements are generated one by one, with the probability distribution of each element estimated from the previously generated values. Given the sequential nature of language and speech, autoregressive models were applied to their generation [30] in the 1980s. Autoregressive neural networks capable of processing and generating sequential data were subsequently proposed in the 1990s [31]. More recently, autoregressive generation approaches [8] have been extended to large-scale neural networks, paving the way for the emergence of foundation models such as GPTs [9].
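
A minimal sketch of this one-by-one scheme follows, with a first-order (bigram) conditional table standing in for a learned model; real autoregressive models condition on the entire generated history rather than only the last element.

```python
import numpy as np

# Autoregressive sampling over a three-symbol alphabet: each new element
# is drawn from a distribution conditioned on what was generated before.
bigram = np.array([[0.1, 0.6, 0.3],   # p(next symbol | previous symbol)
                   [0.4, 0.2, 0.4],
                   [0.3, 0.3, 0.4]])

rng = np.random.default_rng(0)
sequence = [0]                         # start symbol
for _ in range(10):
    probs = bigram[sequence[-1]]       # condition on the previous output
    sequence.append(int(rng.choice(3, p=probs)))  # generate one element
print(sequence)
```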

Other studies have explored approaches to implicit data modeling. For example, normalizing flows [32] use a series of invertible transformations, i.e. flows, to convert a prior distribution into a complex data distribution. These methods do not explicitly estimate the data distribution, but instead provide a probability density function for data generation. When stochastic transformations are applied, the generation process can be viewed as the evolution of a diffusion process. The concept of the diffusion model was introduced by Jarzynski [6] in a study of non-equilibrium systems. Stein et al. [33] proposed a probabilistic approach to learning the diffusion process with a parameterized model. It has been demonstrated theoretically [34,35] that flow-based and diffusion-based models can be unified within a common framework of differential equations [36]. Although the two types of approaches were not mainstream at the time, their successors, probabilistic diffusion models [37,38] and flow matching [39], gained significant prominence in the deep learning era.
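
As a brief statement of the principle behind flows (notation ours, not taken from the cited works): if an invertible map \(f\) sends a latent variable \(z \sim p_Z\) to a sample \(x = f(z)\), the exact density of \(x\) follows from the change-of-variables formula

\[
\log p_X(x) = \log p_Z\big(f^{-1}(x)\big) + \log \left| \det \frac{\partial f^{-1}(x)}{\partial x} \right|,
\]

so likelihood-based training is tractable whenever the Jacobian determinant is.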

In addition, another group of approaches leverages artificial neural networks for generative modeling. The term ‘artificial neural network’ originates from studies of nerve cells [40], while its mathematical inception is rooted in hierarchical linear regression [41,42]. In subsequent developments, artificial neural networks have advanced through the integration of non-linear transformations and the design of complex architectures, now known as deep learning. The basic computational unit in these networks is a neuron [43]; neurons connect to each other through nonlinear activation functions. Such nonlinearity allows these models to theoretically approximate any distribution given a sufficient number of neurons. Hence, many studies leverage artificial neural networks to model data distributions for generative tasks.

Many studies have explored the architectures of artificial neural networks to specify how the components of networks are organized and interact. Several classic network architectures have been proposed, and their design principles continue to influence modern deep neural networks. Feed-forward neural networks, introduced in the 1950s [44], were widely adopted due to their simplicity and effectiveness. Convolutional neural networks (CNNs) emerged in the 1970s [45]. Two-dimensional CNNs, as described by Zhang et al. [46], have become fundamental to image processing and generation with neural networks. Recurrent neural networks (RNNs) incorporate recurrent connections and internal memory, making them well suited for modeling and generating sequential data. The origin of RNNs can be traced back to the mathematical model in statistical mechanics introduced by Lenz [47] and Ising [48]. Kleene [49] conducted a formal analysis of RNNs and framed them within the context of neural networks. Subsequently, the Amari–Hopfield network [50,51] introduced the ability to learn and associatively recall data patterns. It provided a core architecture for storing and generating diverse types of data. Hochreiter and Schmidhuber [52] introduced long short-term memory to better manage the memory and forgetting mechanisms, leading to significant improvements in generating textual data.

Furthermore, several studies [53,54] extend the Amari–Hopfield network framework to explicitly model probability distributions. These models learn an energy function based on data, where lower values correspond to more probable data configurations. The restricted Boltzmann machine [55] was introduced in the 1980s. Researchers applied a two-layer neural network for hierarchical data modeling and used an energy function to determine the probability of neural states.

Backpropagation is the core technique used to train modern neural networks, including those designed for generative tasks. In the 1970s, Linnainmaa [56] developed a method to optimize the parameters of neural network-like models by recursively applying the chain rule to compute derivatives. Werbos [57] proposed applying this method to train artificial neural networks. Starting in the 1980s, this optimization method was commonly referred to by its current name, ‘backpropagation’ [58]. Through a series of studies [58,59], backpropagation has proven to be a general method enabling neural networks to learn useful representations.
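
To make the chain-rule computation concrete, the following minimal sketch trains a two-layer network on a single sample with manually derived gradients; the architecture, learning rate and random data are illustrative.

```python
import numpy as np

# Manual backpropagation for y = W2 @ tanh(W1 @ x) with squared error.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 1))           # input
t = rng.normal(size=(1, 1))           # target
W1 = 0.1 * rng.normal(size=(8, 4))    # first-layer weights
W2 = 0.1 * rng.normal(size=(1, 8))    # second-layer weights
lr = 0.1                              # learning rate

for step in range(100):
    # Forward pass.
    h = np.tanh(W1 @ x)
    y = W2 @ h
    loss = 0.5 * ((y - t) ** 2).item()

    # Backward pass: apply the chain rule layer by layer.
    dy = y - t                        # dL/dy
    dW2 = dy @ h.T                    # dL/dW2
    dh = W2.T @ dy                    # dL/dh
    dW1 = (dh * (1 - h ** 2)) @ x.T   # dL/dW1, using tanh' = 1 - tanh^2

    # Gradient descent update.
    W1 -= lr * dW1
    W2 -= lr * dW2
```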

The inference process of generative models produces data from trained models. In practice, direct inference is often impractical due to its mathematical intractability or excessive computational demands [60]. Consequently, existing methods generate data through approximate inference based on empirical priors and observed historical data. Such methods typically fall into two categories: stochastic approximation and variational inference.

In the realm of generative models, stochastic approximation aims to estimate the probability distribution of data gradually through random sampling. The most representative method is the Markov chain Monte Carlo algorithm [61,62], which constructs a Markov chain in which each state corresponds to a data point embedded in the sample space. Carefully designed transition probabilities between the states ensure that the stationary distribution of the Markov chain approaches the data distribution. Brooks et al. [63] introduced efficient sampling strategies under the assumption that the data distributions are almost independent. Subsequent research has focused on improving the strategies for step size selection [64], sample selection [65] and efficient computing [66].
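
A minimal Metropolis–Hastings sketch of this idea follows; the unnormalized target density and the random-walk proposal are illustrative choices, not drawn from the cited works.

```python
import numpy as np

# Metropolis–Hastings: a random walk whose stationary distribution
# matches an unnormalized 1D target density (two Gaussian bumps).
def target(x):
    return np.exp(-0.5 * (x - 2) ** 2) + np.exp(-0.5 * (x + 2) ** 2)

rng = np.random.default_rng(0)
x = 0.0
samples = []
for _ in range(10000):
    proposal = x + rng.normal(scale=1.0)             # random-walk proposal
    accept = min(1.0, target(proposal) / target(x))  # acceptance probability
    if rng.uniform() < accept:
        x = proposal
    samples.append(x)
# A histogram of `samples` approximates the normalized target density.
```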

Variational inference was systematically introduced into machine learning in the 1990s [67,68]. The core concept is utilizing a tractable parameterized distribution, known as a variational distribution, to approximate the real-world distributions of data. The approximation error between the variational distribution and the true distribution can be effectively measured by an evidence lower bound [67]. Because of its flexibility, variational inference facilitates various generative tasks in complex scenarios.
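
In its standard form (notation ours), for observed data \(x\), latent variables \(z\) and a variational distribution \(q(z)\), the evidence lower bound reads

\[
\log p(x) \;\ge\; \mathbb{E}_{q(z)}\big[\log p(x \mid z)\big] - \mathrm{KL}\big(q(z) \,\|\, p(z)\big),
\]

and the gap between the two sides equals \(\mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big)\), so maximizing the bound drives the variational distribution toward the true posterior.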

When applied to generative tasks, the approaches based on statistical machine learning demonstrate better generalizability than rule-based generative systems. However, for these approaches, adapting to real-world situations remains challenging due to practical issues such as the curse of dimensionality. Nonetheless, some applications have achieved notable success. For instance, the rise of statistical machine translation, which replaced rule-based translation systems, occurred in the 1980s. This marked a shift in the focus of GAI from encoding expert knowledge to collecting large-scale datasets. Later, in the 1990s, statistical approaches became mainstream for speech translation and synthesis, until they were replaced by deep learning models. Moreover, statistical approaches have also been applied to visual content generation, although these applications were limited to specific tasks, such as texture synthesis and image fusion.

Graphics-based models

Graphics-based methods focus on creating visual content through physical modeling. These approaches stem from Marr’s theory of vision [69], a paradigm for reconstructing the shape and appearance of real-world scenes. Within this framework, content generation is achieved through rendering, which refers to the process of combining materials, textures, lighting and other elements to produce visual effects. Recent studies [7,70] have incorporated deep generative learning into these methods, thus making them a part of GAI.

Graphics-based methods reconstruct three-dimensional (3D) scenes either explicitly or implicitly. Explicit representations, such as lines, point clouds [71] and voxels [72], are intuitive to humans and were widely adopted in the early stages of research. In contrast, implicit reconstruction methods use deep neural networks to encode scene information, enabling the rendering of images at arbitrary resolutions.

Rendering techniques can be broadly categorized into two types, namely, rasterization and ray tracing, both of which play crucial roles in GAI methods. These two rendering approaches offer complementary advantages. Rasterization algorithms [73–75] are highly efficient in utilizing hardware for fast rendering, whereas ray tracing algorithms [76–78] provide superior image quality at the cost of intensive computation.

There are also notable studies [79,80] of computing equipment customized for graphics-based methods. The first graphics processing unit (GPU) was presented in 1999, providing an accelerated framework for rasterization rendering. In 2004, Oh and Jung [81] proposed a GPU-based implementation of artificial neural networks. After two decades of development, GPUs have become foundational hardware devices for GAI. Moreover, researchers have focused on developing hardware-agnostic programming interfaces, such as OpenGL [79] and Direct3D [80]. These studies standardized the rasterization rendering pipeline that is still in use today.

Graphics-based generative methods have led to widespread applications in computer animation. The first computer-animated film was released in 1958, marking the beginning of a gradual shift in film production, with generative methods gradually replacing manual techniques. In the mid-1990s, Toy Story, the first feature film created entirely with computer graphics, achieved significant commercial success. Since then, graphics-based generative methods have continued to evolve, becoming the core technology in video games, and enabling the production of highly realistic graphics and complex visual effects.

DEEP GENERATIVE METHODOLOGIES

Notable achievements of GAI stem from the renaissance of deep learning [82,83]. In 2011, researchers [2] showed that increasing the depth of neural networks significantly improved their capacity to learn data representations, achieving superhuman performance in classification tasks. In the following year, subsequent studies [3,84] further corroborated the effectiveness of deep neural networks. Deep neural networks have since been widely applied to generative tasks [4,85]. These networks demonstrate powerful capabilities for understanding data distributions and yield breakthroughs in producing realistic results.

Deep network architectures

The studies on the architectures of deep neural networks can be divided into two categories: improvements over RNN-based architectures and developments built upon attention mechanisms.

Gers et al. [86] proposed a variant of RNNs with forget gates to process long data sequences. In subsequent studies, this mechanism was further elaborated in the form of gated recurrent units [87], which included update gates and reset gates to control information flow. The update gate controls how much information is preserved, and the reset gate determines how much of the accumulated memory should be discarded. Networks that utilize the gated recurrent unit typically have fewer parameters and are computationally efficient; however, they are limited in handling long-term dependencies. Nonetheless, these characteristics make them suitable for real-time generative tasks.
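
In one common formulation (biases omitted; notation ours), with \(\sigma\) the logistic function and \(\odot\) element-wise multiplication, the gates of a gated recurrent unit combine as

\[
z_t = \sigma(W_z x_t + U_z h_{t-1}), \qquad r_t = \sigma(W_r x_t + U_r h_{t-1}),
\]
\[
\tilde{h}_t = \tanh\big(W_h x_t + U_h (r_t \odot h_{t-1})\big), \qquad h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t,
\]

where the reset gate \(r_t\) controls how much of the old memory enters the candidate state and the update gate \(z_t\) interpolates between the old and candidate states.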

Transformers [88] are the most influential deep architecture at present. They process input as tokens, which represent the basic units of data. In the case of textual data, tokens can be words, characters or bytes, depending on the tokenization method used. Using different tokenization methods [89–91], transformer-based models are capable of handling data from different modalities. Moreover, they can leverage large-scale training datasets, establishing themselves as state-of-the-art approaches for diverse applications, including generative tasks.

Transformers process sequential data through two key techniques: positional encoding and the self-attention mechanism. Positional encoding adds position information to the input embeddings, enabling the transformer to capture the relationships in sequential data. The self-attention mechanism assigns weights to different data elements and helps the transformer focus on the most useful parts of the data. This design allows transformers to capture long-range dependencies. Moreover, transformers are well suited for GPU-optimized operations, such as parallel computation, which leads to fast training and inference processes. Subsequent studies [90,92] effectively adapted transformers to vision tasks. Vision transformers divide images into fixed-size patches and utilize a linear mapping technique to convert the patches into a sequence, thus unifying the backbone architecture for visual and textual data. Other efforts have focused on improving the computational efficiency of transformers; examples include linearized self-attention [93,94], sparse transformers [95] and approximation approaches [96].
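
The following minimal sketch implements single-head scaled dot-product self-attention as just described; the dimensions and random projections are illustrative.

```python
import numpy as np

# Single-head scaled dot-product self-attention.
def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq, Wk, Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # pairwise token affinities
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)            # row-wise softmax weights
    return w @ V      # each output position mixes the values of all positions

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                      # 5 tokens, d_model = 16
Wq, Wk, Wv = (rng.normal(size=(16, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)               # shape (5, 8)
```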

The attention mechanism has also inspired other network architectures in addition to transformers. For instance, the attention-based method introduced in [97] effectively models graph-structured data, including protein interactions. In addition, there are studies of deep architectures beyond the attention mechanism, such as capsule networks [98] and state space models [99].

Deep generative models

Deep generative models refer to machine learning models based on deep networks. These models originate from various generative theories, and the most representative categories include generative adversarial networks, variational autoencoders and probabilistic diffusion models.

Generative adversarial networks (GANs) [100,101] have been widely applied to various generative tasks due to their capability to produce realistic data. GANs engage in a minimax game, where the generator aims to produce realistic samples, while the discriminator aims to differentiate between generated and real samples. According to game theory, both networks improve their performance through the adversarial training process, until they reach a dynamic equilibrium where the discriminator cannot distinguish between generated and real samples. This provides theoretical support for the superiority of GANs in terms of generation quality. GAN-based generative models [102–105] have further advantages, especially in terms of the controllability of generated content. For example, StyleGAN [104] can perform semantic editing on images at the pixel level. Additionally, the training and inference of GAN models are very fast, particularly in comparison to graphics-based methods. However, GANs suffer from mode collapse, in which the generator fails to fully capture the complexity of the data distribution, resulting in a restricted variety of generated data. Although some studies [106] have aimed to alleviate this problem, the training of GANs is still prone to collapse, particularly when the model is scaled up.
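
The minimax game is commonly written as (notation ours)

\[
\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))],
\]

where the generator \(G\) maps noise \(z\) to samples and the discriminator \(D\) scores realness; at the equilibrium described above, \(D\) outputs 1/2 everywhere.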

The variational autoencoder (VAE) [107] is another typical type of deep generative model. A VAE learns the distribution characteristics of high-dimensional data in a latent space. It utilizes an encoder network to map high-dimensional data to latent representations and a decoder network to reconstruct the data with resampled representations. During training, the VAE optimizes the reconstruction error while ensuring that the distribution of latent representations approaches a prior distribution. These approaches exhibit strengths that complement the capabilities of GANs. Theoretically, VAE-based models [108,109] can capture the entire distribution of data. Thus, sampling representations from the latent space offers diverse unseen data points. However, the generated data tend to be blurry, and thus lack realistic appearances.
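
A minimal sketch of the VAE training objective follows, with random linear maps standing in for the encoder and decoder networks; all dimensions and weights are illustrative.

```python
import numpy as np

# VAE objective: reconstruction error plus a KL term that pulls the
# encoder's Gaussian posterior toward the standard-normal prior.
rng = np.random.default_rng(0)
d_x, d_z = 6, 2
W_enc = 0.1 * rng.normal(size=(2 * d_z, d_x))   # outputs [mu, log_var]
W_dec = 0.1 * rng.normal(size=(d_x, d_z))

def vae_loss(x):
    h = W_enc @ x
    mu, log_var = h[:d_z], h[d_z:]                # parameters of q(z|x)
    eps = rng.normal(size=d_z)
    z = mu + np.exp(0.5 * log_var) * eps          # reparameterization trick
    x_hat = W_dec @ z                             # reconstruction
    rec = np.sum((x - x_hat) ** 2)                # reconstruction error
    kl = -0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var))  # KL(q || N(0, I))
    return rec + kl

print(vae_loss(rng.normal(size=d_x)))
```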

Probabilistic diffusion models [37,38] describe data generation as a stochastic process. These models involve two processes: the forward diffusion process and the reverse process. During the forward process, prior noise is progressively added to the real data, and the model learns to predict the noise. Then, during the reverse process, the model transforms the sampled noise into data. Studies have focused on improving the speed of the reverse process, which includes introducing latent space generation [110], incorporating discriminative priors [111], combining model distillation techniques [112], etc. Diffusion-based approaches can utilize large-scale training data effectively and generalize well across various generative tasks. In particular, these methods have demonstrated unprecedented performance in zero-shot generative tasks, producing impressive scenes that do not exist in the real world. However, the training and inference of probabilistic diffusion models are computationally intensive. This results in computational demands that are orders of magnitude greater than those of GANs.
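
A minimal sketch of the forward process follows, using the closed-form noising of denoising diffusion models [38] with an illustrative linear schedule; the denoising network itself is omitted.

```python
import numpy as np

# Forward diffusion in closed form: jump a clean sample x0 directly to
# step t; the denoising network is trained to predict the added noise.
T = 1000
betas = np.linspace(1e-4, 0.02, T)       # linear noise schedule
alphas_bar = np.cumprod(1.0 - betas)     # cumulative signal retention

rng = np.random.default_rng(0)
x0 = rng.normal(size=(8,))               # stand-in for a real data sample
t = 500
eps = rng.normal(size=x0.shape)          # the noise the model must predict
xt = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1 - alphas_bar[t]) * eps
# A network eps_hat = model(xt, t) would be trained to minimize ||eps_hat - eps||^2.
```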

Deep generative models have gradually replaced traditional machine learning models since the late 2010s. The ability to utilize large-scale training data gives deep generative models significant flexibility in handling high-dimensional generative tasks. By that time, GAI-generated content had become realistic and was sometimes even indistinguishable from real content. Representative applications use transformer-based models to translate languages or generate various types of textual data, including documents, web pages and code. Moreover, deep generative models such as WaveNet [113] can synthesize realistic and comprehensible audio content, thus supporting multiple applications such as music generation and speech synthesis. FaceSwap [114], which manipulates media by replacing one person’s appearance with that of another, was created in 2017.

Deep generative learning for graphics

The successful application of deep neural networks to 3D perception and understanding has inspired efforts to integrate these networks with rendering techniques. However, the traditional rendering process does not ensure differentiability with respect to model parameters. Therefore, some studies have developed differentiable rendering and end-to-end algorithms, which allow gradient-based parameter optimization and direct editing of 3D scenes.

The neural radiance field [7] is a typical differentiable rendering technique based on implicit representations. It builds on the overall framework of ray tracing, using a multilayer network to model the volumetric scene function and applying volume rendering algorithms to simulate the process of light travel. Moreover, Fridovich et al. [115] used sparse voxel representations to achieve computationally efficient rendering. The recently proposed 3D Gaussian splatting [116] utilizes a rasterization pipeline and employs neural point clouds as scene representations. Three-dimensional Gaussian splatting methods can meet the requirements of real-time rendering while generating realistic images.
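
The volume rendering step in [7] composites color along each camera ray \(\mathbf{r}(t) = \mathbf{o} + t\mathbf{d}\); in continuous form, the expected color is

\[
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\, \sigma(\mathbf{r}(t))\, \mathbf{c}(\mathbf{r}(t), \mathbf{d})\, \mathrm{d}t, \qquad T(t) = \exp\!\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\, \mathrm{d}s\right),
\]

where \(\sigma\) is the learned volume density and \(\mathbf{c}\) the view-dependent color. Because the discretized integral is differentiable, rendering losses propagate gradients back to the network parameters.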

Ray tracing algorithms have become popular in generative applications based on computer graphics due to improvements in computing equipment. In the 2010s, these algorithms brought realistic avatars to commercial films. Techniques for overlaying animated scenes on live-action footage were also developed, enabling high-fidelity interactive rendering. In addition, deep learning–based supersampling technologies enabled real-time ray tracing at 4K resolution on a consumer GPU. The availability of these technologies led to the emergence of various virtual, mixed and augmented reality devices depicting lifelike digital worlds.

FOUNDATION MODELS

The term ‘foundation model’ was introduced in the report of Bommasani et al. [117]. It refers to a base model that is trained on broad data and can be adapted to a wide range of downstream tasks. Such a model is also known as a large X model, for example, a large language model.

Constructing foundation models aligns with the conceptual framework of classic model-based approaches, but does not mandate adherence to any particular type of generative model. The common approach today involves the use of deep neural networks, particularly transformers. As illustrated in Fig. 2c, the training schemes typically include generative pretraining and fine-tuning [8], with specific details varying depending on the input and output data modalities. Foundation models represent a significant shift in GAI, achieving extraordinary performance in the generation of texts, images and contents of other modalities.

Large language models

Foundation models were first applied in the language domain. Devlin et al. [118] pretrained a model on large-scale unlabeled corpora and then fine-tuned the network for specific downstream tasks. Interestingly, researchers have found that scaling pretrained language models often leads to emergent abilities on downstream tasks [119]. For example, a 175B-parameter model [9] can solve few-shot tasks through in-context learning, whereas a 1.5B-parameter model [120] cannot do so nearly as well. However, even fine-tuning such a large model is computationally expensive. Hence, Wei et al. [121] proposed instruction tuning, which fine-tunes foundation models on a collection of datasets described via instructions, substantially improving zero-shot performance on unseen tasks. In addition, some studies [122] have combined instruction tuning with human preferences and feedback. Notably, ChatGPT [1], which was developed based on the large language model of the GPT type [9], demonstrates a remarkable ability to converse with humans. Subsequently, interest in large language models has continued to surge, giving rise to numerous influential studies [123–126]. Among them, Gemini [123] is a notable family of large models that demonstrates state-of-the-art capabilities in reasoning and understanding across various benchmarks. The recently released DeepSeek models [127] have demonstrated remarkable capabilities, particularly in reasoning tasks, while significantly reducing computational costs.

Prompt engineering is an important technique for working with foundation models. It helps foundation models adapt to specific problems without any change to the model parameters. There are typically two types of prompt engineering. One involves carefully designing good prompts for a specific problem. For example, in-context learning provides additional context, such as exemplars, to help models understand the problem. Kojima et al. [128] simply added ‘Let’s think step by step’ before each answer, which yields markedly better performance. The other type of approach compels models to imitate the reasoning process of humans. For instance, the method in [129] provides a few chain-of-thought demonstrations as exemplars in prompting. The least-to-most method [130] breaks down a complex problem into a series of simpler subproblems and then solves them in sequence.
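
As a minimal illustration of the zero-shot chain-of-thought trigger of Kojima et al. [128] (the question is invented for this example):

```python
# Contrast a plain prompt with the zero-shot chain-of-thought variant,
# which appends the trigger phrase before the model's answer.
question = "A farmer has 17 sheep and all but 9 run away. How many are left?"

plain_prompt = f"Q: {question}\nA:"
cot_prompt = f"Q: {question}\nA: Let's think step by step."
```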

The success of large language models has advanced vision-language understanding, where the central challenge is aligning and fusing vision and language features. Researchers have proposed various methods with different architectural designs. The dual-encoder architecture [131] uses parallel visual and language encoders with aligned representations. The encoder-decoder architecture [132] applies joint feature encoding and decoding sequentially. In addition, Alayrac et al. [133] used a large language model as an adapter, harnessing its superior capacity through visual prompts.

Large text-to-image models

Numerous large text-to-image models have been built on foundation models, achieving unprecedented breakthroughs. Text-to-image models are typically based on GANs [134], autoregressive models [135] or probabilistic diffusion models [136,137]. Their supervision for aligning text and image features is obtained from large language models [138] or vision-language models [131]. The Stable Diffusion [136] and FLUX [137] models are among the most commonly used text-to-image models, known for their outstanding generation quality and ability to follow textual instructions.

The text-to-image models facilitate various downstream tasks, including style transfer, personalization, semantic editing, image restoration, image enhancement, etc. These methods can be divided into three types: training-based, testing-time tuning and training-free approaches. The training-based approaches collect additional data and fine-tune the model. These methods include domain-specific editing with weak supervision [139], reference and attribute guidance via self-supervision [140] and instructional editing via full supervision [141]. Testing-time tuning methods optimize the model parameters during model inference; such methods include embedding optimization [142], hypernetwork guidance [143], latent variable optimization [144] and hybrid fine-tuning [145]. Training-free methods use off-the-shelf models without changing any model parameters. These approaches [146] refine the input texts or masks or alter the inverted latent code to generate outputs tailored to the specified task.

Large text-to-video models

Some research efforts [110,147] have led to the development of large models for generating video content. These models are derived from text-to-image models, making them derivatives of foundation models. Despite the variety of designs, most approaches follow the pipeline proposed in Sora [147]. This pipeline first generates low-dimensional videos or latent codes [110], which are then refined using temporal and spatial super-resolution techniques.

Current video generation models are extremely data intensive due to the staggering complexity of the state space in video data. As a result, effectively utilizing limited real-world data is crucial. Existing studies can be broadly categorized into two approaches: spatial-temporal compression [148] and efficient transfer learning [149].

Some studies [147,150] adhere to the scaling law and increase investments in data and computational resources, leading to superior video generation quality and early commercial successes. For example, the aforementioned Sora model [147] can simulate real-world object interactions and generate corresponding videos lasting several minutes. Video models [150] from Runway AI are capable of generating visual storytelling scenes. Produced without human intervention, these synthetic results achieve remarkably realistic effects.

Expanding the use of foundation models

Foundation models have made prominent contributions to GAI. Moreover, applications based on such models can generate content beyond text and images. Because of their ability to generate textual instructions, large language models can help humans interact with computational software and even physical tools through natural language. This makes it possible for humans to use generative technologies without specialized knowledge. For instance, Suno AI [151] allows users to generate realistic music from language descriptions, including customized voices and sound effects, while also designing album covers; First et al. [152] applied large language models to automated reasoning, using them to generate proofs. This effort revitalizes the domain of formal software verification and mathematical problem solving. Additionally, research communities have explored using large language models for specialized tasks, such as automated manufacturing [153], algorithm design [154] and molecular discovery [12].

Because of their generalized capabilities for generation, foundation models provide an intuitive way to simulate real-world scenarios. By learning patterns in complex environments, foundation models can predict environmental dynamics and generate decision-making strategies. In this context, they serve as world models [155] within a specific system. This facilitates the expansion of human comprehension, as the generated content can offer predictions about future events.

Foundation models have been applied to building world models in practical systems. For instance, studies on integrating foundation models into transportation systems [156] focus on addressing practical challenges, such as vehicle navigation and communication. By fulfilling personalized demand through automated content generation, these approaches enhance both the service quality and efficiency of transportation systems. Approaches that integrate foundation models into physical entities have also emerged [157,158]. These efforts leverage foundation models for action control. In addition, research on intelligent agents powered by foundation models is actively advancing [159]. Such generative technologies mitigate the need for highly detailed physical modeling, which is expensive in real-world scenarios.

Limitations of foundation models

However, current GAI technologies are far from perfect. While foundation models are driving commercial success at an unprecedented pace, they still make mistakes in problems that are trivial for humans, much like the AI systems from 70 years ago did. For instance, as of this writing, an issue reported in the OpenAI community revealed that ChatGPT still incorrectly assumed that 9.11 was greater than 9.9 in a dialogue.

The example above illustrates the hallucination problem in foundation models, where they occasionally generate irrelevant, inconsistent or incorrect content. Such hallucinations can be highly misleading, causing users to believe that the provided information is accurate. In other instances, the generated content may be nonsensical, resulting in confusion among users. Additionally, foundation models are susceptible to reasoning errors, as demonstrated in the numeric comparison example, as well as factual inaccuracies, where some of the information is fabricated.

Diagnosing and fixing foundation models is challenging due to their black-box nature. The generative processes are not interpretable, as foundation models consist of complex neural networks trained on vast datasets. As a result, diagnosis is primarily based on outputs, without direct access to the decision-making processes. Such opacity leads to the current approach of applying case-by-case patches to address specific issues, while a general fix remains elusive.

The bottleneck of computational resources presents another significant challenge. The hardware costs for developing foundation models are prohibitively high for individuals, academic institutions and even some AI research organizations. While some tiny versions of foundation models can be deployed on commercial-grade devices, the cost of training a foundation model with billions of parameters is measured in millions of dollars. Running foundation models on portable computing devices presents more challenges, particularly in resource management, computation offloading and mobility management, among others [160]. OpenAI reports that updating a large model can take several months due to limited computational power, meaning that such models cannot incorporate real-time information or receive timely updates.

Harnessing the capabilities of foundation models, which aims to avoid generating illegal, immoral, biased or inaccurate information, is becoming increasingly challenging. Several serious issues have been identified in applications built on these models. For instance, AI chatbots can be tricked by carefully crafted prompts into leaking sensitive data, such as individuals’ names, phone numbers and addresses. Additionally, the content generated by large models can be misused, but preventing such misuse is difficult due to gaps in current security policies.

FUTURE DIRECTIONS

GAI applications based on foundation models have become the current mainstream practice, but the vulnerabilities of foundation models have also been inherited. Currently, research on GAI safety lags behind its technological development. We summarize several critical security issues that urgently require further attention and development.

Value alignment. GAI should understand human intentions and adhere to human values, ensuring that the generated content is helpful while preventing misuse for inappropriate purposes. This goal promotes the practices of responsible GAI and requires a deeper understanding of evaluating alignment capabilities [161]. It also necessitates the development of more comprehensive guidelines that accurately reflect human preferences, which presents a significant challenge to current statistics-based evaluation paradigms.

Source identification. Current GAI-generated content is convincing and easy to manipulate. This makes it necessary to ensure that the origin of generated content is traceable to prevent intellectual property disputes. Therefore, techniques such as invisible watermarks and signatures should be further studied to verify the integrity and ownership of the generated content. However, since GAI-generated content is highly malleable, imposing unalterable identifiers without negatively affecting usability presents a significant challenge.

Security regulations. GAI developers should adhere to a necessary consensus, ensuring that their products do not harm humanity. On the one hand, standards organizations should require that the development process follows legal and ethical guidelines, similarly to the review process in scientific research. On the other hand, a correction mechanism must be established to prevent the dissemination of harmful GAI models or generated content. Depending on the circumstances, this mechanism should mandate that developers publicly verify, correct or retract any released GAI technologies that fail to meet these standards.

Despite safety concerns, the current wave of interest in GAI is likely to persist. Here, we discuss several promising directions with the potential to result in breakthrough improvements and address a broader range of human needs. We note that other dimensions, which may not yet have received broad attention, are also worthy of further exploration.

Unification of modalities. Although there have been some achievements in bridging the textual and visual domains, research on the interactions among multiple systems [162] is still in its early stages. How to align and fuse multiple modalities, including text, images, videos and structured data from different systems, remains an open challenge.

Deciphering GAI models. Explaining how GAI models work, particularly their decision-making processes, in a way that is understandable to humans is both challenging and indispensable. Current efforts [163] have provided heuristic understanding based on phenomenological approaches. Moreover, introducing interpretable principles based on theories from domains such as thermodynamics [38] and electrodynamics [164] into AI modeling offers a promising direction for enhancing the transparency of GAI, with initial successes already demonstrated. However, the reliability of this understanding should be grounded in a computational logic framework to ensure more accurate and dependable interpretations.

Learning from GAI-generated content. Synthetic data are becoming increasingly accessible and important. While current studies assume that synthetic data primarily represent interpolations of patterns from existing data, GAI models have demonstrated the ability to make nontrivial inferences for specialized tasks. Moreover, GAI models can reduce labeling costs, augment existing datasets and facilitate the learning and understanding process of humans. Consequently, more extensive research is anticipated to explore the use of synthetic data, potentially making such data a dominant resource in the future.

Supervision beyond human capability. Throughout the history of GAI, researchers have utilized knowledge familiar to humans to develop generative models. A recently released model [165] trained through reinforcement learning without supervised data has demonstrated reasoning capabilities on par with top-performing models that rely on human supervision. This finding suggests that foundation models may develop superhuman capabilities through self-enhancement. Assuming that these models eventually achieve such capabilities, how can humans effectively regulate them? It is thus necessary to change the learning paradigm while adhering to the principles of serving humanity.

CONCLUSION

This work summarizes the historical and ongoing developments of GAI. We divide the methodologies into rule-based generative systems, model-based generative algorithms, deep generative methodologies and foundation models, and introduce their characteristics and applications. The focus is not on reviewing all the relevant literature, but rather on providing a brief summary of representative methodologies, emphasizing general principles and strategies rather than specific algorithms. Many strategies and ideas mentioned in this work can be realized in various forms, possibly with additional advantages in the future. We also discuss the remaining issues in the context of existing approaches. Moreover, we introduce some potential research directions to address the risks that may undermine the development of GAI.

FUNDING

This work was partially supported by the National Natural Science Foundation of China (62206277, 62425606 and 32341009).

AUTHOR CONTRIBUTIONS

T.T. and R.H. led the project’s initiation and design. All authors contributed to the paper’s structure and organization.

Conflict of interest statement. None declared.

REFERENCES

1.

ChatGPT
.
OpenAI
. https://chatgpt.com  
(15 January 2025, date last accessed)
.

2.

Ciresan
 
DC
,
Meier
 
U
,
Masci
 
J
 et al.  
Flexible, high performance convolutional neural networks for image classification
. In:
Proceedings of the 22nd International Joint Conference on Artificial Intelligence
.
Washington, DC
:
AAAI Press
,
2011
,
1237
42
.

3.

Krizhevsky
 
A
,
Sutskever
 
I
,
Hinton
 
GE
.
ImageNet classification with deep convolutional neural networks
. In:
Proceedings of the 26th International Conference on Neural Information Processing Systems
.
Red Hook, NY
:
Curran Associates
,
2012
,
1097
105
.

4.

Radford
 
A
,
Metz
 
L
,
Chintala
 
S
.
Unsupervised representation learning with deep convolutional generative adversarial networks
.
International Conference on Learning Representations, San Juan, Puerto Rico, 2–4 May 2016
.

5.

Yule
 
GU
.
On a method of investigating periodicities disturbed series, with special reference to Wolfer’s sunspot numbers
.
Philos Trans R Soc Lond
 
1927
;
226
:
267
98
.

6.

Jarzynski
 
C.
 
Equilibrium free-energy differences from nonequilibrium measurements: a master-equation approach
.
Phys Rev E
 
1997
;
56
:
5018
.

7.

Mildenhall
 
B
,
Srinivasan
 
PP
,
Tancik
 
M
 et al.  
NeRF: representing scenes as neural radiance fields for view synthesis
.
Commun ACM
 
2021
;
65
:
99
106
.

8.

OpenAI
.
Improving language understanding by generative pre-training
. https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf  
(15 January 2025, date last accessed)
.

9.

Brown
 
T
,
Mann
 
B
,
Ryder
 
N
 et al.  
Language models are few-shot learners
. In:
Proceedings of the 34th International Conference on Neural Information Processing Systems
.
Red Hook, NY
:
Curran Associates
,
2020
,
1877
901
.

10.

Achiam
 
J
,
Adler
 
S
,
Agarwal
 
S
 et al.  
GPT-4 technical report
. arXiv: 2303.08774.

11.

Dickmanns
 
ED
.
Dynamic Vision for Perception and Control of Motion
.
Heidelberg
:
Springer
,
2007
.

12.

Liu
 
Y
,
Yang
 
Z
,
Yu
 
Z
 et al.  
Generative artificial intelligence and its applications in materials science: current situation and future perspectives
.
J Mater
 
2023
;
9
:
798
816
.

13.

Buchanan
 
BG
,
Duda
 
RO
.
Advances in Computers
.
Amsterdam
:
Elsevier
,
1983
.

14.

Hayes-Roth
 
F.
 
Rule-based systems
.
Commun ACM
 
1985
;
28
:
921
32
.

15.

Grosan
 
C
,
Abraham
 
A
,
Grosan
 
C
 et al.  
Rule-based expert systems
.
Intell Syst Mod Approach
 
2011
;
149
85
.

16.

Masri
 
N
,
Sultan
 
YA
,
Akkila
 
AN
 et al.  
Survey of rule-based systems
.
Int J Acad Inf Syst Res
 
2019
;
3
:
1
22
.

17.

Buchanan
 
BG.
 
Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project
.
Boston
:
Addison-Wesley
,
1984
.

18.

Norvig
 
P.
 
Paradigms of Artificial Intelligence Programming: Case Studies in Common LISP
.
Burlington
:
Morgan Kaufmann Publishers
,
2014
.

19.

SYSTRAN Software Inc. SYSTRAN
. https://www.systransoft.com  
(15 January 2025, date last accessed)
.

20.

Fant
 
G.
 
The Modern Educational Treatment of Deafness
.
Manchester
:
Manchester University Press
,
1960
.

21.

Jaakkola
 
T
,
Haussler
 
D.
 
Exploiting generative models in discriminative classifiers
. In:
Proceedings of the 12th International Conference on Neural Information Processing Systems
,
Cambridge, MA
:
MIT Press
,
1998
,
487
93
.

22.

Jordan
 
MI.
 
Learning in Graphical Models
.
Cambridge, MA
:
MIT Press
,
1999
.

23.

Dempster
 
AP
,
Laird
 
NM
,
Rubin
 
DB.
 
Maximum likelihood from incomplete data via the EM algorithm
.
J R Stat Soc Ser B
 
1977
;
39
:
1
22
.

24.

Stratonovich
 
RL.
 
Conditional Markov processes
.
Theory Probab Appl
 
1960
;
5
:
156
78
.

25.

Baum
 
LE
,
Petrie
 
T.
 
Statistical inference for probabilistic functions of finite state Markov chains
.
Ann Math Stat
 
1966
;
37
:
1554
63
.

26.

Harris
 
TE.
 
Additive set-valued Markov processes and graphical methods
.
Ann Probab
 
1978
;
6
:
355
78
.

27.

Darroch
 
JN
,
Lauritzen
 
SL
,
Speed
 
TP.
 
Markov fields and log-linear interaction models for contingency tables
.
Ann Stat
 
1980
;
8
:
522
39
.

28.

Li
 
SZ
.
Markov random field models in computer vision
. In:
Eklundh
 
JO
(ed)
Computer Vision—ECCV ’94
.
Berlin
:
Springer
,
1994
,
361
70
.

29.

Friedman
 
N
,
Linial
 
M
,
Nachman
 
I
 et al.  
Using Bayesian networks to analyze expression data
.
J Comput Biol
 
2000
;
7
:
601
20
.

30.

Jurafsky
 
D.
 
Speech and Language Processing
.
Upper Saddle River
:
Prentice-Hall
,
2000
.

31.

Schmidhuber
 
J
,
Heil
 
S.
 
Sequential neural text compression
.
IEEE Trans Neural Netw
 
1996
;
7
:
142
6
.

32.

Abresch
 
U
,
Langer
 
J.
 
The normalized curve shortening flow and homothetic solutions
.
J Differ Geom
 
1986
;
23
:
175
96
.

33.

Stein
 
C.
 
A bound for the error in the normal approximation to the distribution of a sum of dependent random variables
.
Berkeley Symp Math Stat Prob
 
1972
;
2
:
583
602
.

34.

Song
 
Y
,
Sohl-Dickstein
 
J
,
Kingma
 
DP
 et al.  
Score-based generative modeling through stochastic differential equations
.
International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023
.

35. Zhang Q, Chen Y. Diffusion normalizing flow. In: Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2021, 16280–91.

36. Van Kampen NG. Stochastic differential equations. Phys Rep 1976; 24: 171–228.

37. Song Y, Ermon S. Generative modeling by estimating gradients of the data distribution. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2019, 11918–30.

38. Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2020, 6840–51.

39. Lipman Y, Chen RT, Ben-Hamu H et al. Flow matching for generative modeling. International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023.

40. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 1943; 5: 115–33.

41. Stigler SM. Gauss and the invention of least squares. Ann Stat 1981; 9: 465–74.

42. Merriman M. A List of Writings Relating to the Method of Least Squares: With Historical and Critical Notes. London: Academy Press, 1877.

43. Jenkin N. Affective processes in perception. Psychol Bull 1957; 54: 100–27.

44. Rosenblatt F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 1958; 65: 386–408.

45. Fukushima K. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 1980; 36: 193–202.

46. Zhang W, Tanida J, Itoh K et al. Shift-invariant pattern recognition neural network and its optical architecture. In: Proceedings of Annual Conference of the Japan Society of Applied Physics. Montreal, Canada: Japan Society of Applied Physics, 1988, 2147–51.

47. Lenz W. Beitrag zum Verständnis der magnetischen Erscheinungen in festen Körpern [Contribution to the understanding of magnetic phenomena in solid bodies]. Z Phys 1920; 21: 613–5.

48. Ising E. Beitrag zur Theorie des Ferro- und Paramagnetismus [Contribution to the theory of ferro- and paramagnetism]. Doctoral Thesis. University of Hamburg, 1924.

49. Kleene SC. Representation of events in nerve nets and finite automata. Automata Stud 1951; 1: 3–103.

50. Amari SI. Learning patterns and pattern sequences by self-organizing nets of threshold elements. IEEE Trans Comput 1972; 100: 1197–206.

51. Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci USA 1982; 79: 2554–8.

52. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997; 9: 1735–80.

53. Heilmann P, Rigney D. An energy-based model of friction and its application to coated systems. Wear 1981; 72: 195–217.

54. BakIr G. Predicting Structured Data. Cambridge, MA: MIT Press, 2007.

55. Rumelhart DE, McClelland JL. Information Processing in Dynamical Systems: Foundations of Harmony Theory. Cambridge, MA: MIT Press, 1986.

56. Linnainmaa S. The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master's Thesis. University of Helsinki, 1970.

57. Werbos PJ. Applications of advances in nonlinear sensitivity analysis. In: Drenick RF, Kozin F (eds.). System Modeling and Optimization. Berlin: Springer, 1982, 762–70.

58. Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature 1986; 323: 533–6.

59. LeCun Y, Touretzky D, Hinton G et al. A theoretical framework for back-propagation. In: Proceedings of the 1988 Connectionist Models Summer School. Burlington: Morgan Kaufmann, 1988, 21–8.

60. Mitchell T. Machine Learning. New York: McGraw-Hill, 1997.

61. Peskun PH. Optimum Monte-Carlo sampling using Markov chains. Biometrika 1973; 60: 607–12.

62. Geyer CJ. Practical Markov chain Monte Carlo. Stat Sci 1992; 7: 473–83.

63. Brooks S. Markov chain Monte Carlo method and its application. J R Stat Soc Ser D 1998; 47: 69–100.

64. Chib S, Greenberg E. Understanding the Metropolis-Hastings algorithm. Am Stat 1995; 49: 327–35.

65. Hoffman MD, Gelman A. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J Mach Learn Res 2014; 15: 1593–623.

66. Chen T, Fox E, Guestrin C. Stochastic gradient Hamiltonian Monte Carlo. In: Proceedings of the 31st International Conference on Machine Learning. PMLR, 2014, 1683–91.

67. Jordan MI, Ghahramani Z, Jaakkola TS et al. An introduction to variational methods for graphical models. Mach Learn 1999; 37: 183–233.

68. Wainwright MJ, Jordan MI. Graphical models, exponential families, and variational inference. Found Trends Mach Learn 2008; 1: 1–305.

69. Marr D, Thach WT. A theory of cerebellar cortex. In: Vaina L (ed.). From the Retina to the Neocortex. Boston: Birkhäuser, 1991, 11–50.

70. Lindell DB, Martel JN, Wetzstein G. AutoInt: automatic integration for fast neural volume rendering. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2021, 14551–60.

71. Várady T, Benko P. Reverse engineering B-rep models from multiple point clouds. In: Proceedings of Geometric Modeling and Processing 2000: Theory and Applications. Piscataway, NJ: IEEE Press, 2000, 3–12.

72. Caon M. Voxel-based computational models of real human anatomy: a review. Radiat Environ Biophys 2004; 42: 229–35.

73. Blinn JF. Models of light reflection for computer synthesized pictures. In: Proceedings of the 4th Annual Conference on Computer Graphics and Interactive Techniques. New York: Association for Computing Machinery, 1977, 192–8.

74. Bartell FO, Dereniak EL, Wolfe WL. The theory and measurement of bidirectional reflectance distribution function (BRDF) and bidirectional transmittance distribution function (BTDF). In: Radiation Scattering in Optical Systems. SPIE, 1981, 154–60.

75. Akenine-Möller T, Haines E, Hoffman N. Real-Time Rendering. Boca Raton: A K Peters/CRC Press, 2019.

76. Appel A. Some techniques for shading machine renderings of solids. In: Proceedings of the April 30–May 2, 1968, Spring Joint Computer Conference. New York: Association for Computing Machinery, 1968, 37–45.

77. Whitted T. An improved illumination model for shaded display. In: Proceedings of the 6th Annual Conference on Computer Graphics and Interactive Techniques. New York: Association for Computing Machinery, 1979.

78. Cline D, Talbot J, Egbert P. Energy redistribution path tracing. ACM Trans Graph 2005; 24: 1186–95.

79. Woo M, Neider J, Davis T et al. OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 1.2. Boston: Addison-Wesley, 1999.

80. Blythe D. The Direct3D 10 system. Conference on Computer Graphics and Interactive Techniques, Boston, MA, USA, 30 July–3 August 2006.

81. Oh KS, Jung K. GPU implementation of neural networks. Pattern Recognit 2004; 37: 1311–4.

82. Ivakhnenko AG. Polynomial theory of complex systems. IEEE Trans Syst Man Cybern 1971; 364–78.

83. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015; 521: 436–44.

84. Le QV, Ranzato M, Monga R et al. Building high-level features using large scale unsupervised learning. In: Proceedings of the 29th International Conference on Machine Learning. Madison, WI: Omnipress, 2012, 507–14.

85. Kingma DP, Welling M. Auto-encoding variational Bayes. International Conference on Learning Representations, Banff, Canada, 14–16 April 2014.

86. Gers FA, Schmidhuber J, Cummins F. Learning to forget: continual prediction with LSTM. Neural Comput 2000; 12: 2451–71.

87. Cho K, van Merrienboer B, Gulcehre C et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Kerrville, TX: Association for Computational Linguistics, 2014, 1724–34.

88. Vaswani A, Shazeer N, Parmar N et al. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2017, 6000–10.

89. Webster JJ, Kit C. Tokenization as the initial phase in NLP. In: Proceedings of the 14th Conference on Computational Linguistics. Kerrville, TX: Association for Computational Linguistics, 1992, 1106–10.

90. Dosovitskiy A, Beyer L, Kolesnikov A et al. An image is worth 16x16 words: transformers for image recognition at scale. International Conference on Learning Representations, Virtual, 3–7 May 2021.

91. Perozzi B, Al-Rfou R, Skiena S. DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: Association for Computing Machinery, 2014, 701–10.

92. Liu Z, Lin Y, Cao Y et al. Swin Transformer: hierarchical vision transformer using shifted windows. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV). Los Alamitos, CA: IEEE Computer Society, 2021, 9992–10002.

93. Schmidhuber J. Learning to control fast-weight memories: an alternative to dynamic recurrent networks. Neural Comput 1992; 4: 131–9.

94. Katharopoulos A, Vyas A, Pappas N et al. Transformers are RNNs: fast autoregressive transformers with linear attention. In: Proceedings of the 37th International Conference on Machine Learning. JMLR, 2020, 5156–65.

95. Tay Y, Bahri D, Yang L et al. Sparse Sinkhorn attention. In: Proceedings of the 37th International Conference on Machine Learning. JMLR, 2020, 9438–47.

96. Yun C, Bhojanapalli S, Rawat AS et al. Are transformers universal approximators of sequence-to-sequence functions? International Conference on Learning Representations, New Orleans, LA, 6–9 May 2019.

97. Veličković P, Cucurull G, Casanova A et al. Graph attention networks. International Conference on Learning Representations, Vancouver, Canada, 30 April–3 May 2018.

98. Sabour S, Frosst N, Hinton GE. Dynamic routing between capsules. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2017, 3859–69.

99. Gu A, Dao T. Mamba: linear-time sequence modeling with selective state spaces. Conference on Language Modeling, Philadelphia, PA, USA, 7–9 October 2024.

100. Schmidhuber J. A possibility for implementing curiosity and boredom in model-building neural controllers. In: Proceedings of the First International Conference on Simulation of Adaptive Behavior on From Animals to Animats. Cambridge, MA: MIT Press, 1991, 222–7.

101. Goodfellow IJ, Pouget-Abadie J, Mirza M et al. Generative adversarial networks. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, Vol. 2. Cambridge, MA: MIT Press, 2014, 2672–80.

102. Brock A, Donahue J, Simonyan K. Large scale GAN training for high fidelity natural image synthesis. International Conference on Learning Representations, New Orleans, LA, 6–9 May 2019.

103. Karras T, Aila T, Laine S et al. Progressive growing of GANs for improved quality, stability, and variation. International Conference on Learning Representations, Vancouver, Canada, 30 April–3 May 2018.

104. Karras T, Laine S, Aila T. A style-based generator architecture for generative adversarial networks. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2019, 4396–405.

105. Skorokhodov I, Tulyakov S, Elhoseiny M. StyleGAN-V: a continuous video generator with the price, image quality and perks of StyleGAN2. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2022, 3616–26.

106. Arjovsky M, Chintala S, Bottou L. Wasserstein generative adversarial networks. In: Proceedings of the 34th International Conference on Machine Learning. PMLR, 2017, 214–23.

107. Kingma DP, Welling M. Auto-encoding variational Bayes. International Conference on Learning Representations, Banff, Canada, 14–16 April 2014.

108. Ranganath R, Gerrish S, Blei D. Black box variational inference. In: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics. PMLR, 2014, 814–22.

109. Kucukelbir A, Ranganath R, Gelman A et al. Automatic variational inference in Stan. In: Proceedings of the 29th International Conference on Neural Information Processing Systems, Vol. 1. Cambridge, MA: MIT Press, 2015, 568–76.

110. Rombach R, Blattmann A, Lorenz D et al. High-resolution image synthesis with latent diffusion models. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2022, 10674–85.

111. Dhariwal P, Nichol A. Diffusion models beat GANs on image synthesis. In: Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2021, 8780–94.

112. Salimans T, Ho J. Progressive distillation for fast sampling of diffusion models. International Conference on Learning Representations, Virtual, 25–29 April 2022.

113. van den Oord A, Dieleman S, Zen H et al. WaveNet: a generative model for raw audio. In: Proceedings of the 9th ISCA Speech Synthesis Workshop, Sunnyvale, CA, USA, 13–15 September 2016.

114. Deepfakes. Deepfakes Software. https://github.com/deepfakes/faceswap (8 May 2024, date last accessed).

115. Fridovich-Keil S, Yu A, Tancik M et al. Plenoxels: radiance fields without neural networks. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2022, 5491–500.

116. Kerbl B, Kopanas G, Leimkühler T et al. 3D Gaussian splatting for real-time radiance field rendering. ACM Trans Graph 2023; 42: 1–14.

117. Bommasani R, Hudson DA, Adeli E et al. On the opportunities and risks of foundation models. arXiv: 2108.07258.

118. Devlin J, Chang MW, Lee K et al. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. Kerrville, TX: Association for Computational Linguistics, 2019, 4171–86.

119. Wei J, Tay Y, Bommasani R et al. Emergent abilities of large language models. arXiv: 2206.07682.

120. OpenAI. Language models are unsupervised multitask learners. https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf (15 January 2025, date last accessed).

121. Wei J, Bosma M, Zhao V et al. Finetuned language models are zero-shot learners. International Conference on Learning Representations, Virtual, 25–29 April 2022.

122. Ouyang L, Wu J, Jiang X et al. Training language models to follow instructions with human feedback. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2022, 27730–44.

123. Gemini Team, Anil R, Borgeaud S et al. Gemini: a family of highly capable multimodal models. arXiv: 2312.11805.

124. Chowdhery A, Narang S, Devlin J et al. PaLM: scaling language modeling with pathways. J Mach Learn Res 2023; 24: 240.

125. Touvron H, Lavril T, Izacard G et al. LLaMA: open and efficient foundation language models. arXiv: 2302.13971.

126. Du Z, Qian Y, Liu X et al. GLM: general language model pretraining with autoregressive blank infilling. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Kerrville, TX: Association for Computational Linguistics, 2022, 320–35.

127. Liu A, Feng B, Xue B et al. DeepSeek-V3 technical report. arXiv: 2412.19437.

128. Kojima T, Gu SS, Reid M et al. Large language models are zero-shot reasoners. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2022, 22199–213.

129. Wei J, Wang X, Schuurmans D et al. Chain-of-thought prompting elicits reasoning in large language models. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2022, 24824–37.

130. Zhou D, Schärli N, Hou L et al. Least-to-most prompting enables complex reasoning in large language models. International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023.

131. Radford A, Kim JW, Hallacy C et al. Learning transferable visual models from natural language supervision. In: Proceedings of the 38th International Conference on Machine Learning. PMLR, 2021, 8748–63.

132. Tsimpoukelli M, Menick JL, Cabi S et al. Multimodal few-shot learning with frozen language models. In: Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2021, 200–12.

133. Alayrac JB, Donahue J, Luc P et al. Flamingo: a visual language model for few-shot learning. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2022, 23716–36.

134. Kang M, Zhu JY, Zhang R et al. Scaling up GANs for text-to-image synthesis. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2023, 10124–34.

135. Ramesh A, Pavlov M, Goh G et al. Zero-shot text-to-image generation. In: Proceedings of the 38th International Conference on Machine Learning. PMLR, 2021, 8821–31.

136. Computer Vision and Learning research group at Ludwig Maximilian University of Munich. Stable Diffusion. https://github.com/CompVis/stable-diffusion (8 May 2024, date last accessed).

137. Black-forest-labs. FLUX. https://github.com/black-forest-labs/flux (8 May 2024, date last accessed).

138. Raffel C, Shazeer N, Roberts A et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 2020; 21: 140.

139. Kim G, Kwon T, Ye JC. DiffusionCLIP: text-guided diffusion models for robust image manipulation. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2022, 2416–25.

140. Yang S, Chen X, Liao J. Uni-paint: a unified framework for multimodal image inpainting with pretrained diffusion model. In: Proceedings of the 31st ACM International Conference on Multimedia. New York: Association for Computing Machinery, 2023, 3190–9.

141. Brooks T, Holynski A, Efros AA. InstructPix2Pix: learning to follow image editing instructions. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2023, 18392–402.

142. Mokady R, Hertz A, Aberman K et al. Null-text inversion for editing real images using guided diffusion models. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2023, 6038–47.

143. Zhang Y, Huang N, Tang F et al. Inversion-based style transfer with diffusion models. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2023, 10146–56.

144. Mou C, Wang X, Song J et al. DragonDiffusion: enabling drag-style manipulation on diffusion models. International Conference on Learning Representations, Vienna, Austria, 7–11 May 2024.

145. Kawar B, Zada S, Lang O et al. Imagic: text-based real image editing with diffusion models. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos, CA: IEEE Computer Society, 2023, 6007–17.

146. Avrahami O, Fried O, Lischinski D. Blended latent diffusion. ACM Trans Graph 2023; 42: 149.

147. OpenAI. Sora. https://openai.com/sora (8 May 2024, date last accessed).

148. Wang Y, Chen X, Ma X et al. LaVie: high-quality video generation with cascaded latent diffusion models. Int J Comput Vis 2024; doi: 10.1007/s11263-024-02295-1.

149. PKU-Yuan Group. Open-Sora Plan. https://github.com/PKU-YuanGroup/Open-Sora-Plan (8 May 2024, date last accessed).

150. Höppe T, Mehrjou A, Bauer S et al. Diffusion models for video prediction and infilling. Advances in Neural Information Processing Systems, New Orleans, LA, USA, 2022.

151. Suno Inc. Suno AI. https://suno.com (8 May 2024, date last accessed).

152. First E, Rabe MN, Ringer T et al. Baldur: whole-proof generation and repair with large language models. In: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. New York: Association for Computing Machinery, 2023, 1229–41.

153. Rane NL. ChatGPT and similar generative artificial intelligence (AI) for smart industry: role, challenges and opportunities for Industry 4.0, Industry 5.0 and Society 5.0. Innov Bus Strateg Manag 2024; 4: 10–7.

154. Mankowitz DJ, Michi A, Zhernov A et al. Faster sorting algorithms discovered using deep reinforcement learning. Nature 2023; 618: 257–63.

155. Ha D, Schmidhuber J. Recurrent world models facilitate policy evolution. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2018, 2455–67.

156. Zhang R, Xiong K, Du H et al. Generative AI-enabled vehicular networks: fundamentals, framework, and case study. IEEE Netw 2024; 38: 259–67.

157. Brohan A, Brown N, Carbajal J et al. RT-1: robotics transformer for real-world control at scale. Robotics: Science and Systems, Daegu, Republic of Korea, 10–14 July 2023.

158. Ahn M, Brohan A, Brown N et al. Do as I can, not as I say: grounding language in robotic affordances. In: Proceedings of the 6th Conference on Robot Learning, Vol. 205. PMLR, 2023, 287–318.

159. Xi Z, Chen W, Guo X et al. The rise and potential of large language model based agents: a survey. Sci China Inf Sci 2025; 68: 121101.

160. Xu M, Du H, Niyato D et al. Unleashing the power of edge-cloud generative AI in mobile networks: a survey of AIGC services. IEEE Commun Surv Tutor 2024; 1127–70.

161. Liu Y, Zhang Y, Duan P et al. Technical countermeasures for security risks of artificial general intelligence. Strategic Stud Chin Acad Eng 2021; 23: 75–81.

162. Du H, Niyato D, Kang J et al. The age of generative AI and AI-generated everything. IEEE Netw 2024; 38: 501–12.

163. Tan Z, Chen T, Zhang Z et al. Sparsity-guided holistic explanation for LLMs with interpretable inference-time intervention. In: Proceedings of the 38th AAAI Conference on Artificial Intelligence. Washington, DC: AAAI Press, 2024, 21619–27.

164. Xu Y, Liu Z, Tegmark M et al. Poisson flow generative models. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2022, 16782–95.

165. Guo D, Yang D, Zhang H et al. DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning. arXiv: 2501.12948.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.