Abstract

A scheme for estimating atmospheric parameters Teff, log g and [Fe/H] is proposed on the basis of the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm and Haar wavelet. The proposed scheme consists of three processes. A spectrum is decomposed using the Haar wavelet transform and low-frequency components at the fourth level are considered as candidate features. Then, spectral features from the candidate features are detected using the LASSO algorithm to estimate the atmospheric parameters. Finally, atmospheric parameters are estimated from the extracted spectral features using the support-vector regression (SVR) method. The proposed scheme was evaluated using three sets of stellar spectra from the Sloan Digital Sky Survey (SDSS), Large Sky Area Multi-object Fibre Spectroscopic Telescope (LAMOST) and Kurucz's model, respectively. The mean absolute errors are as follows: for the 40 000 SDSS spectra, 0.0062 dex for log Teff (85.83 K for Teff), 0.2035 dex for log g and 0.1512 dex for [Fe/H]; for the 23 963 LAMOST spectra, 0.0074 dex for log Teff (95.37 K for Teff), 0.1528 dex for log g and 0.1146 dex for [Fe/H]; for the 10 469 synthetic spectra, 0.0010 dex for log Teff (14.42 K for Teff), 0.0123 dex for log g and 0.0125 dex for [Fe/H].

INTRODUCTION

The implementation of large-scale and deep sky survey programmes, such as the Sloan Digital Sky Survey (SDSS: York et al. 2000; Ahn et al. 2012), Gaia–ESO Survey (GES: Gilmore et al. 2012; Randich et al. 2013) and Large Sky Area Multi-object Fibre Spectroscopic Telescope (LAMOST/Guoshoujing Telescope: Zhao et al. 2006; Cui et al. 2012), has resulted in the collection of a large number of stellar spectra. The bulk of this stellar spectral information necessitates the utilization of a fully automated characterization process.

In particular, estimating the effective temperature, surface gravity and metallicity from stellar spectra remains a fundamental problem (Wu et al. 2011; Song et al. 2012). For example, Muirhead et al. (2012) investigated the estimation of the effective temperature Teff and metallicity [M/H] for late-K and M-type planet-candidate host stars from the K-band spectra released by the Kepler mission. In that study, a surface-fitting method was used along with three spectral indices, namely the equivalent widths of the Na i (2.210 μm) and Ca i (2.260 μm) lines and an index describing the flux change between three 0.02 μm wide bands centred at 2.245, 2.370 and 2.080 μm. Koleva et al. (2009) developed a full-spectrum fitting package, ULySS (University of Lyon Spectroscopic analysis Software), and explored its application to spectral parametrization. Manteiga et al. (2010) parametrized stellar spectra by extracting features based on Fourier analysis and wavelet decomposition, and by constructing a mapping from feature space to parameter space using a feed-forward neural network (FNN) with three layers.

An automated estimation system for atmospheric parameters from a stellar spectrum is also referred to as a spectrum parametrization system (SPS) in related studies. An SPS system consists of two key processes, namely feature extraction and parameter estimation. Feature extraction determines the representation of spectral information. Principal component analysis (PCA: Re Fiorentin et al. 2007; Bu & Pan 2015) and fast Fourier transforms (FFT: Manteiga et al. 2010) are the two most frequently used feature extraction methods. Parameter estimation constructs a mapping from the extracted features of a spectrum to its atmospheric parameters. Some commonly used estimation methods are the FNN (Re Fiorentin et al. 2007), Gaussian process (GP: Bu & Pan 2015) and support-vector regression (SVR: Li et al. 2014).

This work describes a scheme for estimating atmospheric parameters from the stellar spectrum. The three processes of the proposed scheme are performed as follows. Several candidate features are initially obtained by transforming a spectrum into a low-frequency space using the Haar wavelet transform. Then, some representative candidate features are selected as spectral features using the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm (Tibshirani 1996). Finally, atmospheric parameters are estimated from extracted features using a regression method.

High-frequency components are removed in the first process. These components are usually more affected by noise than low-frequency components, as discussed further in Section 5.3. The second process efficiently reduces the number of spectral features, which is closely related to the estimation efficiency and the exploration of representative features (Section 5.3). In the third process, the spectrum parametrization problem is investigated using four typical estimation methods: the radial basis function neural network (RBFNN: Schwenker, Kestler & Palm 2001), SVR (Chang & Lin 2001; Schölkopf & Smola 2002; Smola & Schölkopf 2004), K-nearest neighbour regression (KNNR: Altman 1992) and least-squares regression (LSR: James et al. 2013). Experimental results indicate that SVR is superior to the other three methods.
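To make the three processes concrete, the following minimal sketch chains them together in Python, assuming the PyWavelets (pywt) and scikit-learn packages are available; the random arrays are placeholders standing in for rest-frame, rebinned spectra and their catalogue parameters, not the survey data used in this work.

import numpy as np
import pywt
from sklearn.linear_model import LassoCV
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_pix = 3821                                   # flux values per rebinned SDSS spectrum
pix = np.arange(n_pix)
y = rng.uniform(size=300)                      # mock parameter (stand-in for log Teff, log g or [Fe/H])
# Mock spectra: a broad absorption feature whose depth tracks the parameter, plus noise.
flux = rng.normal(scale=0.05, size=(300, n_pix)) - y[:, None] * np.exp(-0.5 * ((pix - 1500) / 150.0) ** 2)

# Process 1: Haar wavelet decomposition; keep the level-4 approximation
# (low-frequency) coefficients as candidate features (239 for 3821 pixels).
candidates = np.array([pywt.wavedec(s, 'haar', level=4)[0] for s in flux])

# Process 2: LASSO with a cross-validated penalty; candidate features with
# non-zero coefficients are retained as spectral features.
selected = np.flatnonzero(LassoCV(cv=10).fit(candidates, y).coef_)

# Process 3: support-vector regression (Gaussian kernel) from the selected features to the parameter.
model = SVR(kernel='rbf').fit(candidates[:, selected], y)
print(len(selected), np.mean(np.abs(model.predict(candidates[:, selected]) - y)))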

This article is organized as follows. Section 2 describes some applied experimental data. After introducing four regression methods in Section 3, Section 4 describes a scheme for extracting spectral features. In Section 5, the proposed scheme is evaluated experimentally. Finally, Section 6 concludes this work.

DATA

The proposed scheme was evaluated by performing experiments using three data sets: 33 963 actual spectra from LAMOST, 50 000 actual spectra from SDSS and 18 969 synthetic spectra computed from Kurucz's model. Actual spectra usually carry some disturbances from noise and pre-processing imperfections.

The proposed scheme is a statistical learning method. It basically involves the discovery of a mapping from stellar spectra to their atmospheric parameters using empirical data, which is referred to as a training set in machine learning. At the same time, the performance of the mapping discovered should be evaluated objectively. Therefore, an independent set of stellar spectra is needed for this evaluation. This independent set is usually referred to as a test set. Thus, in each experiment, stellar spectra are split into two subsets, namely the training and test sets.
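A minimal sketch of such a split with scikit-learn; `features` and `params` are placeholders for the spectrum representations and catalogue parameters, and the sizes mirror the SDSS split used later rather than a fixed requirement.

import numpy as np
from sklearn.model_selection import train_test_split

features = np.random.rand(50000, 239)          # placeholder spectrum representations
params = np.random.rand(50000)                 # placeholder atmospheric parameters
X_train, X_test, y_train, y_test = train_test_split(
    features, params, train_size=10000, random_state=0)   # 10 000 for training, the rest for testing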

Actual spectra from SDSS

A set of 50 000 stellar spectra and their physical parameters collected from SDSS are selected (Abazajian et al. 2009; Yanny et al. 2009). The selected spectra span the ranges [4088, 9740] K for Teff, [1.015, 4.998] dex for surface gravity and [−3.497, 0.268] dex for [Fe/H]. All stellar spectra are shifted to their rest frames (zero radial velocity) using the radial velocity provided by the SDSS/SEGUE (Sloan Extension for Galactic Understanding and Exploration) Spectroscopic Parameter Pipeline (SSPP: Beers et al. 2006; Allende Prieto et al. 2008; Lee et al. 2008a,b, 2011; Smolinski et al. 2011) and rebinned to a maximal common log(wavelength) range [3.581862, 3.963961] with a sampling step of 0.0001. The common wavelength range is approximately [3818.23, 9203.67] Å. The SDSS training and test sets are labelled $S_{\rm tr}^{\rm SD}$ and $S_{\rm te}^{\rm SD}$, respectively. Unless otherwise specified, the sizes of $S_{\rm tr}^{\rm SD}$ and $S_{\rm te}^{\rm SD}$ are 10 000 and 40 000, respectively.

Actual spectra from LAMOST

A set of 33 963 LAMOST stellar spectra (Luo et al. 2012) and their physical parameters from the LAMOST pipeline are selected based on two signal-to-noise ratio (SNR) constraints, namely SNRg ≥ 20 and SNRr ≥ 20 in the g and r bands (Luo et al. 2012).1 This constraint reflects the current stability of the quality of LAMOST spectra. The LAMOST spectra are also shifted to their rest frames using the radial velocity provided by the LAMOST pipeline and rebinned to their maximal common log(wavelength) range [3.5845, 3.9567] with a sampling step of 0.0001. The common wavelength range is approximately [3841.49, 9051.07] Å. All LAMOST spectra span the ranges [3853.2, 9927] K for Teff, [0.8920, 4.9959] dex for log g and [−2.3280, 0.9360] dex for [Fe/H]. The LAMOST training and test sets are labelled $S_{\rm tr}^{\rm LA}$ and $S_{\rm te}^{\rm LA}$, respectively. $S_{\rm tr}^{\rm LA}$ consists of 10 000 stellar spectra, while $S_{\rm te}^{\rm LA}$ consists of 23 963 spectra.

Synthetic spectra

A set of 18 969 synthetic spectra are calculated from Kurucz's NEWODF (new opacity distribution function) models (Castelli & Kurucz 2003) using the SPECTRUM (v2.76) package (Gray & Corbally 1994), together with 830 828 atomic and molecular lines (contained in two files, luke.lst and luke.nir.lst). Additional atomic data, including the solar abundances of Grevesse & Sauval (1998), are taken from the file stdatom.dat. The SPECTRUM package and the three data files can be downloaded from the website.2

The grids of the synthetic stellar spectra span parameter ranges [4000, 9750] K for Teff (45 values, step sizes of 100 K between 4000 and 7500 K and 250 K between 7750 and 9750 K), [1, 5] dex for log g (17 values, step size of 0.25 dex) and [−3.6, 0.3] dex for [Fe/H] (27 values, step sizes of 0.2 dex between −3.6 and −1 dex and 0.1 dex between −1 and 0.3 dex). The synthetic stellar spectra are also split into two subsets: a training set $S_{\rm tr}^{\rm SY}$ and a test set $S_{\rm te}^{\rm SY}$ consisting of 8500 and 10 469 spectra, respectively.
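As a sketch, the parameter grid described above can be laid out with NumPy as follows; this reproduces only the grid spacing, not the synthetic spectra themselves.

import numpy as np

teff = np.concatenate([np.arange(4000, 7501, 100),      # 100 K steps over [4000, 7500] K (36 values)
                       np.arange(7750, 9751, 250)])     # 250 K steps over [7750, 9750] K; 45 values in total
logg = np.linspace(1.0, 5.0, 17)                        # 0.25 dex steps over [1, 5] dex
feh = np.concatenate([np.linspace(-3.6, -1.2, 13),      # 0.2 dex steps over [-3.6, -1.2] dex
                      np.linspace(-1.0, 0.3, 14)])      # 0.1 dex steps over [-1.0, 0.3] dex; 27 values in total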

ESTIMATION MODELS AND EVALUATION METHODS

Here, x represents a description of a stellar spectrum, while y is the atmospheric parameter Teff, log g or [Fe/H] of x.

Estimation models

The spectrum parametrization problem is to recover the mapping f:

$$y = f(\boldsymbol{x}), \tag{1}$$

from a set of empirical data (training set).

In this work, f is estimated using four typical regression methods: RBFNN (Schwenker et al. 2001), SVR (Chang & Lin 2001; Schölkopf & Smola 2002; Smola & Schölkopf 2004), KNNR (Altman 1992) and LSR (James et al. 2013). RBFNN and KNNR are non-linear regression methods, while LSR is a linear regression method. SVR can be implemented with either a Gaussian kernel or a linear kernel; these two cases are labelled SVRG and SVRl, respectively. Thus SVRG is a non-linear method, whereas SVRl is a linear method.
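As an illustration, the regression families compared here could be instantiated with scikit-learn as follows; the hyperparameters are library defaults rather than the settings tuned in this work, and scikit-learn has no built-in RBF neural network, so RBFNN is omitted from the sketch.

from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression

estimators = {
    'SVRG': SVR(kernel='rbf'),                    # SVR with a Gaussian kernel (non-linear)
    'SVRl': SVR(kernel='linear'),                 # SVR with a linear kernel (linear)
    'KNNR': KNeighborsRegressor(n_neighbors=5),   # K-nearest neighbour regression (non-linear)
    'LSR': LinearRegression(),                    # least-squares regression (linear)
}
# Each candidate for f is fitted on the training set and applied to the test set:
# estimators[name].fit(X_train, y_train); y_pred = estimators[name].predict(X_test)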

Evaluation criteria

Suppose that $\hat{f}$ is an estimate of the spectrum parametrization mapping f and S = {(x, y)} is a data set, where x is a representation of a stellar spectrum and y is an atmospheric parameter of the corresponding star. The data set S can be a training or test set.

For convenient comparison with related reports, this study evaluates the performance of an estimate $\hat{f}$ using three criteria: the mean error (ME), mean absolute error (MAE) and standard deviation (SD). These criteria are widely used in related studies (Re Fiorentin et al. 2007; Jofré et al. 2010; Tan et al. 2013).

Using the data set S, the three evaluation criteria can be defined as follows:
$$\mathrm{ME} = \frac{1}{n}\sum_{(\boldsymbol{x},y)\in S} \big(\hat{f}(\boldsymbol{x}) - y\big), \tag{2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{(\boldsymbol{x},y)\in S} \big|\hat{f}(\boldsymbol{x}) - y\big|, \tag{3}$$

$$\mathrm{SD} = \sqrt{\frac{1}{n}\sum_{(\boldsymbol{x},y)\in S} \big(\hat{f}(\boldsymbol{x}) - y - \mathrm{ME}\big)^{2}}, \tag{4}$$
where n is the number of elements in S.
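A minimal sketch of these criteria, assuming `y_true` and `y_pred` are NumPy arrays holding the catalogue values y and the estimates of f over a data set S.

import numpy as np

def evaluate(y_true, y_pred):
    resid = y_pred - y_true          # residuals of the estimate on S
    me = resid.mean()                # mean error, equation (2)
    mae = np.abs(resid).mean()       # mean absolute error, equation (3)
    sd = resid.std()                 # standard deviation of the residuals, equation (4)
    return me, mae, sd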

FEATURE EXTRACTION

Extracting candidate features

Atmospheric parameters Teff, log g and [Fe/H] show an evident non-linear dependence on stellar spectra (tables 6, 10 and 11 in Li et al. 2014). Therefore, the spectrum parametrization problem is investigated by non-linearly transforming a spectrum before estimating atmospheric parameters. In this work, stellar spectra are transformed using a Haar wavelet (Mallat 2009) and decomposed into a series of components with different wavelengths and frequencies (time–frequency localization).

High-frequency components are usually more affected by noise than low-frequency components. Thus, this work obtains candidate features by removing high-frequency components. This process will be discussed further in Section 5.3.

In addition to the Haar wavelet, several alternative bases for this non-linear transformation are available (Daubechies 1992; Mallat 2009), such as the Coiflet, Daubechies, Symmlet and biorthogonal wavelets.
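A sketch of this decomposition with PyWavelets (pywt); the `flux` array is a placeholder for a rest-frame, rebinned spectrum rather than real survey data.

import numpy as np
import pywt

flux = np.random.rand(3821)                      # mock spectrum with 3821 flux values
coeffs = pywt.wavedec(flux, 'haar', level=4)     # returns [cA4, cD4, cD3, cD2, cD1]
candidates = coeffs[0]                           # level-4 approximation: 239 low-frequency coefficients
details = coeffs[1:]                             # high-frequency components, discarded in this scheme
# Alternative bases mentioned above can be substituted, e.g. 'db4' (Daubechies),
# 'coif1' (Coiflet), 'sym4' (Symmlet) or 'bior2.2' (biorthogonal).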

Refining the candidate features

Experiments indicate the presence of many redundancies in the extracted candidate features (Section 5.3). Therefore, this work proposes a scheme for detecting spectral features from the extracted candidate features using the LASSO algorithm (Tibshirani 1996).

Let $S_{\rm tr} = \{(\boldsymbol{x}, y)\}$ be a training set (Section 2), where $\boldsymbol{x} = (x_1, \cdots, x_m)$ represents a stellar spectrum based on its candidate features, y is an atmospheric parameter of the corresponding star and m is a positive integer. The LASSO algorithm selects features using the following model:

$$\hat{\boldsymbol{w}} = \mathop{\arg\min}_{\boldsymbol{w}} \left[ \sum_{(\boldsymbol{x},y) \in S_{\rm tr}} \Big( y - \sum_{i=1}^{m} w_i x_i \Big)^{2} + \lambda \sum_{i=1}^{m} |w_i| \right], \tag{5}$$

where λ > 0 is a preset parameter. In this model, only a few $\hat{w}_i$ will be non-zero and the variables $x_i$ with a non-zero $\hat{w}_i$ are selected as spectral features. The parameter λ controls the number of non-zero parameters $\hat{w}_i$ or, equivalently, the number of detected features. In this work, the parameter λ is estimated by tenfold cross-validation (Tibshirani 1996; Sjöstrand 2005).
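A sketch of this selection step using scikit-learn's LassoCV as a stand-in for the LASSO implementation cited above; note that scikit-learn parametrizes the penalty as alpha rather than λ, and the mock data below only illustrate the mechanics.

import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
X = rng.normal(size=(10000, 239))                               # mock candidate features, one row per training spectrum
y = X[:, 10] - 0.5 * X[:, 120] + 0.1 * rng.normal(size=10000)   # mock atmospheric parameter

lasso = LassoCV(cv=10).fit(X, y)        # penalty strength chosen by tenfold cross-validation
selected = np.flatnonzero(lasso.coef_)  # indices i with non-zero w_i: the detected spectral features
print(lasso.alpha_, selected)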

Based on the training set from SDSS (Section 2.1), 17 spectral features are detected for Teff, 24 for log g and 25 for [Fe/H] (Table 1).

Table 1.

Extracted features for estimating atmospheric parameters. WP is the wavelength position represented by a two-dimensional vector [a, b], where a and b are the starting and ending wavelengths (Å).

Label  WP (Å)                 Label  WP (Å)                 Label  WP (Å)                 Label  WP (Å)

(a) Extracted features for estimating Teff.
T1   [3932.439, 3946.045]   T2   [4217.567, 4232.159]   T3   [4780.375, 4796.915]   T4   [4851.343, 4868.128]
T5   [5033.406, 5050.821]   T6   [5070.631, 5088.175]   T7   [5108.131, 5125.804]   T8   [5126.984, 5144.723]
T9   [5145.908, 5163.712]   T10  [5164.901, 5182.771]   T11  [5203.098, 5221.100]   T12  [6562.389, 6585.094]
T13  [8524.364, 8553.857]   T14  [8650.914, 8680.845]   T15  [8747.058, 8777.321]   T16  [8844.270, 8874.870]
T17  [9008.697, 9039.866]

(b) Extracted features for estimating log g.
L1   [3818.229, 3831.440]   L2   [3889.215, 3902.672]   L3   [3932.439, 3946.045]   L4   [4095.076, 4109.245]
L5   [4295.978, 4310.841]   L6   [4540.064, 4555.772]   L7   [4556.821, 4572.587]   L8   [4573.640, 4589.464]
L9   [4658.671, 4674.789]   L10  [4833.503, 4850.226]   L11  [4851.343, 4868.128]   L12  [4869.249, 4886.096]
L13  [4887.221, 4904.130]   L14  [4923.365, 4940.399]   L15  [5164.901, 5182.771]   L16  [5183.964, 5201.900]
L17  [5222.302, 5240.371]   L18  [5241.577, 5259.712]   L19  [5280.341, 5298.611]   L20  [5299.831, 5318.167]
L21  [5319.392, 5337.796]   L22  [5418.287, 5437.033]   L23  [5498.725, 5517.750]   L24  [6562.389, 6585.094]

(c) Extracted features for estimating [Fe/H].
F1   [3932.439, 3946.045]   F2   [3990.819, 4004.626]   F3   [4005.549, 4019.407]   F4   [4020.333, 4034.242]
F5   [4035.172, 4049.133]   F6   [4506.735, 4522.327]   F7   [4607.465, 4623.406]   F8   [4745.282, 4761.700]
F9   [4780.375, 4796.915]   F10  [4798.019, 4814.620]   F11  [4815.729, 4832.390]   F12  [4851.343, 4868.128]
F13  [4869.249, 4886.096]   F14  [4941.536, 4958.633]   F15  [4959.775, 4976.935]   F16  [5051.984, 5069.463]
F17  [5108.131, 5125.804]   F18  [5241.577, 5259.712]   F19  [5260.924, 5279.126]   F20  [5280.341, 5298.611]
F21  [5299.831, 5318.167]   F22  [5398.362, 5417.039]   F23  [5438.285, 5457.101]   F24  [8524.364, 8553.857]
F25  [8650.914, 8680.845]

EXPERIMENTS AND DISCUSSION

Performance for SDSS spectra

Using the features detected in Table 1, a spectrum parametrization model can be learned from the training set $S_{\rm tr}^{\rm SD}$ (Section 2.1). The performance obtained on the test set $S_{\rm te}^{\rm SD}$ is presented in Table 2.

Table 2.

Performance of the proposed scheme on 40 000 test spectra from SDSS (10 000 SDSS spectra for training, Section 2.1).

Method    log Teff (Teff)                                            log g                           [Fe/H]
          MAE              ME                      SD                 MAE     ME       SD             MAE     ME            SD
RBFNN     0.0065 (88.48)   4.42 × 10⁻⁴ (6.28)      0.0107 (148.04)    0.2159  0.0205   0.3228         0.1547  6.04 × 10⁻⁴   0.2197
SVRG      0.0062 (85.83)   6.05 × 10⁻⁴ (9.40)      0.0101 (146.66)    0.2035  −0.0193  0.3053         0.1512  1.19 × 10⁻²   0.2158
KNNR      0.0069 (94.77)   −8.39 × 10⁻⁴ (−10.13)   0.0109 (154.62)    0.2178  −0.0370  0.3069         0.2198  −3.56 × 10⁻²  0.2999
LSR       0.0072 (99.22)   3.45 × 10⁻⁴ (5.46)      0.0111 (160.73)    0.2594  0.0270   0.3574         0.1786  3.61 × 10⁻³   0.2472
SVRl      0.0070 (96.77)   3.41 × 10⁻⁴ (7.11)      0.0111 (162.79)    0.2417  0.0475   0.3648         0.1758  −8.62 × 10⁻³  0.2466

Notes. The unit for Teff is K; the unit for log Teff is log(K).


For the SDSS test set, the MAEs are 0.0062, 0.2035 and 0.1512 dex for log Teff, log g and [Fe/H], respectively. To compare the proposed scheme with previous related reports, its performance was also evaluated using the ME and SD measures. More experimental results are presented in Table 2, as well as in Figs 1 and 2. Direct comparisons with related reports are given in Section 6 and the dispersion in Fig. 1 is discussed further in Section 5.4.

Figure 1. Performance of the proposed scheme on 40 000 test spectra from SDSS (10 000 SDSS spectra for training, Section 2.1) using SVRG.

Figure 2. Residual distributions of the proposed scheme for 40 000 test spectra from SDSS (10 000 SDSS spectra for training, Section 2.1).

Performance for LAMOST spectra and synthetic spectra

The proposed scheme is also tested on actual spectra from LAMOST (Section 2.2) and synthetic spectra (Section 2.3).

The performance of the scheme on LAMOST spectra is presented in Table 3, as well as in Figs 3 and 4. The test results for synthetic spectra are presented in Table 4, as well as in Figs 5 and 6. The results in Tables 2, 3 and 4 show that SVRG and RBFNN are more suitable than KNNR, LSR and SVRl for estimating atmospheric parameters.

Figure 3. Performance of the proposed scheme on 23 963 test spectra from LAMOST (10 000 LAMOST spectra for training, Section 2.2) using SVRG.

Figure 4. Residual distributions of the proposed scheme for 23 963 test spectra from LAMOST (10 000 LAMOST spectra for training, Section 2.2) using SVRG.

Figure 5. Performance of the proposed scheme on 10 469 test spectra computed from Kurucz's model (8500 synthetic spectra for training, Section 2.3) using SVRG.

Figure 6. Residual distributions of the proposed scheme for 10 469 test spectra computed from Kurucz's model (8500 synthetic spectra for training, Section 2.3) using SVRG.

Table 3.

Performance of the proposed scheme on 23 963 test spectra from LAMOST (10 000 LAMOST spectra for training, Section 2.2).

Method    log Teff (Teff)                                            log g                           [Fe/H]
          MAE              ME                      SD                 MAE     ME       SD             MAE     ME       SD
RBFNN     0.0070 (91.14)   6.75 × 10⁻⁵ (2.37)      0.0099 (131.36)    0.1664  0.0109   0.2753         0.1197  −0.0038  0.1767
SVRG      0.0074 (95.37)   1.27 × 10⁻⁴ (4.30)      0.0106 (141.62)    0.1528  −0.0008  0.2102         0.1146  −0.0112  0.1528
KNNR      0.0085 (111.80)  4.47 × 10⁻⁴ (10.08)     0.0126 (173.66)    0.1934  0.0167   0.2730         0.1625  −0.0154  0.2151
LSR       0.0082 (106.51)  6.67 × 10⁻⁴ (11.46)     0.0117 (161.83)    0.2218  0.0173   0.3404         0.1312  −0.0143  0.1807
SVRl      0.0081 (105.73)  1.89 × 10⁻⁴ (5.44)      0.0116 (159.83)    0.2070  −0.0124  0.3145         0.1311  −0.0222  0.1806

Notes. The unit for Teff is K; the unit for log Teff is log(K).


Table 4.

Performance of the proposed scheme on 10 469 test spectra computed from Kurucz's model (8500 synthetic spectra for training, Section 2.3).

Method    log Teff (Teff)                                            log g                                  [Fe/H]
          MAE              ME                      SD                 MAE     ME            SD                MAE     ME            SD
RBFNN     0.0010 (14.15)   1.34 × 10⁻⁴ (1.42)      0.0014 (20.26)     0.0217  1.89 × 10⁻³   0.0582            0.0203  1.95 × 10⁻³   0.0282
SVRG      0.0010 (14.42)   2.63 × 10⁻⁴ (3.34)      0.0015 (20.81)     0.0123  −9.42 × 10⁻⁴  0.0590            0.0125  −3.58 × 10⁻⁴  0.0256
KNNR      0.0027 (39.39)   3.92 × 10⁻⁵ (0.19)      0.0041 (61.08)     0.2167  3.55 × 10⁻²   0.3166            0.1007  2.90 × 10⁻²   0.1611
LSR       0.0026 (36.18)   −2.54 × 10⁻⁴ (−3.68)    0.0033 (46.05)     0.1416  2.36 × 10⁻²   0.1902            0.0903  9.77 × 10⁻³   0.1175
SVRl      0.0025 (34.91)   1.14 × 10⁻⁴ (1.79)      0.0032 (45.80)     0.1343  2.21 × 10⁻²   0.1920            0.0783  4.12 × 10⁻³   0.1122

Notes. The unit for Teff is K; the unit for log Teff is log(K).


Filtering and selection: positive or negative?

The proposed scheme extracts spectral features by removing the high-frequency components of the Haar wavelet transform and rejecting most low-frequency components with the LASSO algorithm. This subsection examines whether these processes eliminate important spectral information, such as weak lines.

Four experiments are conducted and the results are listed in Table 5. The results show that some useful spectral information may indeed be eliminated. In practice, however, the observed spectrum is inevitably contaminated with noise and, in theory, weak lines should be more sensitive to noise.

Table 5.

In these experiments, the advantages and disadvantages of eliminating high-frequency as well as many low-frequency components are evaluated. The parameters are estimated by a support-vector machine (RBF kernel) and RBF neural network on SDSS samples. Their performances are assessed by MAE. WT(i, 0) and WT(i, 1) represent the coefficients of a wavelet transform with i-level decomposition in the approximation and high-frequency sub-bands. {Ti}, {Li} and {Fi} denote the extracted features for log Teff, log g and [Fe/H], respectively. The number after the colon represents the total number of features utilized.

                     log Teff                                 log g                                    [Fe/H]
Features             SVRG     RBFNN    Features               SVRG     RBFNN    Features               SVRG     RBFNN
{Ti}: 17             0.0062   0.0065   {Li}: 24               0.2035   0.2159   {Fi}: 25               0.1512   0.1547
WT(4,0): 239         0.0055   0.0062   WT(4,0): 239           0.1909   0.2267   WT(4,0): 239           0.1311   0.1486
WT(4,1)+{Ti}: 256    0.0165   0.0083   WT(4,1)+{Li}: 263      0.2368   0.2449   WT(4,1)+{Fi}: 264      0.1862   0.1770
Full: 3823           0.0460   0.0131   Full: 3823             0.3726   0.2366   Full: 3823             0.4118   0.1769

Therefore, the loss from this elimination is trivial. The wavelet components with the lowest frequency are traditional choices of spectral features for estimating atmospheric parameters (Lu et al. 2013). In the experiments for Teff, when all low-frequency wavelet components are used, the number of features increases from 17 to 239 (an increase of (239 − 17)/17 = 1305.88 per cent), but the MAE decreases by only 0.0007 dex (11.29 per cent; first two rows of Table 5). When no component is eliminated while estimating Teff, the number of features increases from 17 to 3823 (an increase of 22 388.23 per cent) and the MAE increases by 0.0398 dex (641.93 per cent). A small number of detected features indicates an efficient process for estimating atmospheric parameters from spectral features. These results suggest that the model developed in this work can estimate stellar atmospheric parameters both accurately and efficiently.

Knowledge base, dispersion and performance

The proposed scheme is a statistical learning method. Its primary principle is to discover automatically the mapping from a stellar spectrum to its atmospheric parameters from a training set. The training set is the carrier of knowledge and affects the accuracy of the scheme.

Therefore, the size of the training set affects the performance of the proposed scheme. For example, if the size of the training set is increased from 10 000 to 15 000, 20 000 and 25 000 in the experiments on SDSS spectra,3 the test dispersion can clearly be improved (Fig. 7). Similar experiments are conducted on synthetic spectra and the corresponding results are presented in Fig. 8.

Figure 7. Dispersion can be improved by increasing the size of the training set from 10 000 to 15 000, 20 000 and 25 000 in the experiments on SDSS test spectra. This experiment is conducted using SVRG.

Figure 8. Dispersion can be improved by increasing the size of the training set from 8500 to 10 469 in the experiment on synthetic test spectra. This experiment is conducted using SVRG.

Actual data usually present some disturbances arising from noise and pre-processing imperfections (e.g. sky lines and/or cosmic-ray removal residuals, residual calibration defects and interstellar extinction instability4). The negative effect from these factors can be reduced to a certain extent by enriching the knowledge carrier, i.e. the training set (Figs 7 and 8).

Compactness

For simplicity, this section considers the features for log g as an example to discuss compactness. The features for Teff and [Fe/H] can be analysed similarly.

The original SDSS spectra are described by 3821 fluxes. To estimate log g, 24 features are detected and data reduction is (3821 − 24)/(3821) ≈ 99.37 per cent. This result indicates that log g can be estimated from a spectral description with a dimension of 24 instead of 3821. The small number of features also implies high efficiency in estimating the parameter from spectral information.

Compactness concerns whether the number of features can be reduced further. Experimental results show that if seven features, namely L1, L8, L13, L19, L20, L21 and L23, are rejected, the number of features decreases by 7/24 ≈ 29 per cent while the MAE increases by 0.0027 dex (approximately 0.0027/0.2035 ≈ 1.32 per cent).5 Therefore, the feature set can be reduced further if a slight decrease in accuracy is acceptable.
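A sketch of this kind of refinement: drop the seven features and refit the same regressor, then compare test MAEs. The arrays below are placeholders, and the assumption that the 24 log g features are column-ordered L1-L24 is for illustration only.

import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X_tr, y_tr = rng.normal(size=(2000, 24)), rng.normal(size=2000)   # placeholder log g feature matrix and labels
X_te, y_te = rng.normal(size=(1000, 24)), rng.normal(size=1000)

drop = {0, 7, 12, 18, 19, 20, 22}                 # columns of L1, L8, L13, L19, L20, L21, L23 (assumed ordering)
keep = [i for i in range(24) if i not in drop]

for cols, tag in [(list(range(24)), 'all 24 features'), (keep, '17 features')]:
    pred = SVR(kernel='rbf').fit(X_tr[:, cols], y_tr).predict(X_te[:, cols])
    print(tag, 'MAE =', np.mean(np.abs(pred - y_te)))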

CONCLUSION AND FUTURE WORK

This work investigated the estimation of effective temperature (Teff), surface gravity (log g) and metallicity ([Fe/H]) from stellar spectra based on the Haar wavelet transform and LASSO algorithm. The proposed scheme is evaluated using actual spectra from SDSS and LAMOST as well as synthetic spectra computed from Kurucz's model. Favourable results are achieved in all cases.

The proposed scheme exhibits excellent robustness and sparseness. The features are extracted in two steps. For the SDSS data, the original spectra, described by 3821 fluxes, are first decomposed using the Haar wavelet transform and the low-frequency coefficients (239 features) are retained as candidate features. In this step, some noise and redundancy associated with the high-frequency components are removed. A small subset of the candidate features is then chosen as spectral features using the LASSO algorithm. This second step is a supervised learning process that selects features according to their correlation with the parameter to be estimated. The number of selected features is 17 for Teff, 24 for log g and 25 for [Fe/H]. A representative related work is Re Fiorentin et al. (2007), in which 50 features are extracted for estimating atmospheric parameters.

Another advantage of the proposed scheme is its high accuracy. Using the SVRG method and 40 000 stellar spectra from SDSS, the MAEs are 0.0062 dex for log Teff (85.83 K for Teff), 0.2035 dex for log g and 0.1512 dex for [Fe/H]. Further details are shown in Table 2. In previous reports, Re Fiorentin et al. (2007) estimated the parameters of 19 000 spectra from SDSS with MAEs of 0.0126 dex for log Teff, 0.3644 dex for log g and 0.1949 dex for [Fe/H]. Jofré et al. (2010) first highly compressed the data using a likelihood method and then estimated the parameters Teff, log g and [Fe/H] from low-resolution stellar spectra measured by SEGUE; the standard deviations (SDs) of the errors were 130 K for Teff, 0.5 dex for log g and 0.25 dex for [Fe/H]. Therefore, the results estimated using the proposed scheme exhibit higher accuracy compared with those reported in the literature.

In this work, the proposed scheme is evaluated using three different data sets (SDSS, LAMOST and synthetic spectra). For each kind of data, the proposed model is learned and tested independently. An interesting problem is the estimation of atmospheric parameters of one data set (e.g. LAMOST) using a model learned from the other data sets (e.g. SDSS or synthetic spectra). For example, Re Fiorentin et al. (2007) investigated how to estimate the atmospheric parameters of SDSS data from synthetic spectra and vice versa. In this process, residual calibration defects should be considered. This work focuses on sparse feature extraction and the abovementioned problem will be investigated in future work.

ACKNOWLEDGEMENTS

The authors thank the reviewer and editor for their instructive comments and extend their thanks to Professor Ali Luo and Fang Zuo for their support and discussions. This work is supported by the National Natural Science Foundation of China (grant Nos 61273248, 61075033 and 61202315), the Natural Science Foundation of Guangdong Province (2014A030313425, S2011010003348), the Open Project Program of the National Laboratory of Pattern Recognition (NLPR) (201001060) and the high-performance computing platform of South China Normal University.

Footnotes

1 This constraint was not used on SDSS data.

3 Correspondingly, the size of the test set decreases from 40 000 to 35 000, 30 000 and 25 000, respectively.

4 By instability, we mean that a slight difference in the interstellar extinction of multiple stars may be observed.

5 These experiments are conducted on SDSS spectra (Section 2.1).

REFERENCES

Abazajian K. N. et al., 2009, ApJS, 182, 543
Ahn C. P. et al., 2012, ApJS, 203, 21
Allende Prieto C. et al., 2008, AJ, 136, 2070
Altman N. S., 1992, Amer. Stat., 46, 175
Beers T. C. et al., 2006, Mem. Soc. Astron. Ital., 77, 1171
Bu Y., Pan J., 2015, MNRAS, 447, 256
Castelli F., Kurucz R. L., 2003, in Piskunov N. E., Weiss W. W., Gray D. F., eds, IAU Symp. 210, Modelling of Stellar Atmospheres. Cambridge Univ. Press, Cambridge, p. A20
Chang C. C., Lin C. J., 2001, LIBSVM: A Library for Support Vector Machines
Cui X. et al., 2012, Res. Astron. Astrophys., 12, 1197
Daubechies I., 1992, Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, Philadelphia
Gilmore G. et al., 2012, Messenger, 147, 25
Gray R. O., Corbally C. J., 1994, AJ, 107, 742
Grevesse N., Sauval A. J., 1998, Space Sci. Rev., 85, 161
James G., Witten D., Hastie T., Tibshirani R., 2013, An Introduction to Statistical Learning with Applications in R. Springer-Verlag, New York
Jofré P., Panter B., Hansen C. J., Weiss A., 2010, A&A, 517, A57
Koleva M., Prugniel P., Bouchard A., Wu Y., 2009, A&A, 501, 1269
Lee Y. S. et al., 2008a, AJ, 136, 2022
Lee Y. S. et al., 2008b, AJ, 136, 2050
Lee Y. S. et al., 2011, AJ, 141, 90
Li X., Wu Q. M. J., Luo A., Zhao Y., Lu Y., Zuo F., Yang T., Wang Y., 2014, ApJ, 790, 105
Lu Y., Li X., Wang Y., Yang T., 2013, Spectrosc. Spectral Anal., 33, 2010
Luo A. et al., 2012, Res. Astron. Astrophys., 12, 1243
Mallat S., 2009, A Wavelet Tour of Signal Processing, 3rd edn. Academic Press, Boston
Manteiga M., Ordóñez D., Dafonte C., Arcay B., 2010, PASP, 122, 608
Muirhead P. S., Hamren K., Schlawin E., Rojas-Ayala B., Covey K. R., Lloyd J. P., 2012, ApJ, 750, L37
Randich S., Gilmore G., Gaia–ESO Consortium, 2013, Messenger, 154, 47
Re Fiorentin P., Bailer-Jones C. A. L., Lee Y. S., Beers T. C., Sivarani T., Wilhelm R., Allende Prieto C., Norris J. E., 2007, A&A, 467, 1373
Schölkopf B., Smola A. J., 2002, Learning with Kernels. MIT Press, Cambridge, MA
Schwenker F., Kestler H. A., Palm G., 2001, Neural Networks, 14, 439
Sjöstrand K., 2005, Matlab Implementation of LASSO, LARS, the Elastic Net and SPCA (Version 2.0), DTU 2005.6. Informatics and Mathematical Modelling, Technical University of Denmark
Smola A. J., Schölkopf B., 2004, Stat. Comput., 14, 199
Smolinski J. P. et al., 2011, AJ, 141, 89
Song Y. et al., 2012, Res. Astron. Astrophys., 12, 453
Tan X., Pan J., Wang J., Luo A., Tu L., 2013, Spectrosc. Spectral Anal., 33, 1397
Tibshirani R., 1996, J. R. Stat. Soc. B, 58, 267
Wu Y. et al., 2011, Res. Astron. Astrophys., 11, 924
Yanny B. et al., 2009, AJ, 137, 4377
York D. G. et al., 2000, AJ, 120, 1579
Zhao G., Chen Y., Shi J., Liang Y., Hou J., Chen L., Zhang H., Li A., 2006, Chin. J. Astron. Astrophys., 6, 265