Table 3.

Perplexity comparison between the protein language model (LM) ESM-2 (Lin et al. 2023), the antibody-specific LMs AntiBERTy (Ruffolo et al. 2021) and AbLang-1 (Olsen et al. 2022b), and our new selection of antibody-specific LMs (see Section 2.4).^a

	Germline residues				Nongermline residues
	Heavy		Light		Heavy			Light
	FWR	CDR1/2	FWR	CDR1/2	FWR	CDR1/2	CDR3	FWR	CDR1/2	CDR3
ESM-2	1.91	4.12	2.54	6.11	32.03	24.36	20.85	23.20	19.37	24.29
AntiBERTy	1.05	1.10	1.17	1.28	29.64	21.51	18.44	40.14	21.75	16.95
AbLang-1	1.03	1.08	1.07	1.16	25.80	17.73	14.47	52.14	25.72	16.75
Ab-Unpaired	1.02	1.07	1.01	1.05	26.81	18.95	14.42	37.60	19.37	17.25
Ab-Paired	1.02	1.06	1.02	1.05	27.24	18.70	14.23	38.95	19.25	16.98
Ab-FL	1.10	1.17	1.09	1.16	10.33	11.18	12.69	10.82	10.24	11.04
Ab-ModMask	1.11	1.18	1.09	1.17	10.26	11.13	13.18	10.78	10.19	11.42
Ab-FT	1.11	1.18	1.10	1.18	10.88	11.91	13.67	11.25	10.63	12.29
AbLang-2	1.10	1.17	1.09	1.16	9.92	11.13	12.47	10.09	9.54	10.77

^a While most of the models are near perfect at predicting masked germline residues, predictions for nongermline (NGL) residues show substantially higher perplexities. For ESM-2, AntiBERTy, AbLang-1, Ab-Unpaired, and Ab-Paired, NGL perplexities are close to or worse than those of a random prediction (a uniform guess over the 20 standard amino acids has a perplexity of 20). The largest improvement in NGL prediction came from switching to focal loss. Scaling up the model also improved performance, as seen in AbLang-2’s performance compared to Ab-FT. The best perplexity for each region is shown in bold.
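As a reading aid for the numbers above, the following minimal sketch shows how a perplexity can be computed from the probabilities a model assigns to the true masked residues, and how a focal loss down-weights positions that are already predicted confidently. It assumes the standard definitions (perplexity as the exponential of the mean negative log-likelihood, and the focal loss of Lin et al. 2017); the function names, the example probabilities, and the value gamma = 2 are illustrative assumptions and are not taken from the paper.

```python
import math


def perplexity(true_residue_probs):
    """Exponential of the mean negative log-likelihood of the true residues."""
    nll = [-math.log(p) for p in true_residue_probs]
    return math.exp(sum(nll) / len(nll))


def cross_entropy(p_true):
    """Standard cross-entropy loss for a single masked position."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0):
    """Focal loss for a single position: -(1 - p)^gamma * log(p).

    gamma = 2.0 is an assumed, commonly used value, not the paper's setting.
    """
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


# Hypothetical probabilities assigned to the true residue at each position.
uniform_guess = [1 / 20] * 100   # no information: uniform over 20 amino acids
confident_model = [0.98] * 100   # near-perfect predictions

print(round(perplexity(uniform_guess), 2))
print(round(perplexity(confident_model), 2))

# Focal loss shrinks the contribution of easy positions far more than hard ones.
for p in (0.98, 0.05):
    print(p, round(cross_entropy(p), 4), round(focal_loss(p), 6))
```

Under these definitions, a perplexity near 1 means the model almost always places its probability mass on the true residue, while values at or above 20 indicate predictions no better than a uniform guess. Because the focal term scales the loss by (1 - p)^gamma, confidently predicted positions contribute very little to training, which shifts the emphasis towards harder positions.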
