Our Galaxy’s youngest disc

ABSTRACT

We investigate the structure of our Galaxy’s young stellar disc by fitting the distribution functions (DFs) of a new family to 5D Gaia data for a sample of |$47\, 000$| OB stars. Tests of the fitting procedure show that the young disc’s DF would be strongly constrained by Gaia data if the distribution of Galactic dust were accurately known. The DF that best fits the real data accurately predicts the kinematics of stars at their observed locations, but it predicts the spatial distribution of stars poorly, almost certainly on account of errors in the best-available dust map. We argue that dust models could be greatly improved by modifying the dust model until the spatial distribution of stars predicted by a DF agreed with the data. The surface density of OB stars is predicted to peak at |$R\simeq 5.5\, \mathrm{kpc}$|⁠, slightly outside the reported peak in the surface density of molecular gas; we suggest that the latter radius may have been underestimated through the use of poor kinematic distances. The velocity distributions predicted by the best-fitting DF for stars with measured line-of-sight velocities v_∥ reveal that the outer disc is disturbed at the level of |$10\, \mathrm{km\, s}^{-1}$| in agreement with earlier studies, and that the measured values of v_∥ have significant contributions from the orbital velocities of binaries. Hence the outer disc is colder than it is sometimes reported to be.

Galaxy: disc, Galaxy: kinematics and dynamics, Galaxy: structure

1 INTRODUCTION

Over the last two decades our Galaxy has been the target of intense observational activity not only on account of its interest as our home but also because of its cosmological significance: it is a uniquely accessible example of the type of galaxy that currently dominates the cosmic star-formation rate. Data from massive photometric (Schmidt et al. 2005; Skrutskie et al. 2006; Kaiser et al. 2010) spectroscopic (Steinmetz et al. 2006; De Silva et al. 2015; Majewski et al. 2017) and astrometric (Gaia Collaboration 2016) surveys are now in hand and we need to synthesize these data into a coherent physical picture of our archetypal Galaxy.

The stellar distribution of our Galaxy is dominated by the disc. Over the last half century, it has become conventional to decompose the disc into several components. On the largest scale, Gilmore & Reid (1983) pointed out that the disc is split into thin and thick components. More recently, in light of spectroscopic data rather than star-counts, the view is gaining ground that the fundamental division is between a disc of very old stars with [α/Fe] larger than the Sun and a continuously forming disc of stars with ‘normal’ values of [α/Fe] (Hayden et al. 2015; Bland-Hawthorn et al. 2019). The former, very old disc has a scale height |$z_0\gt 0.7\, \mathrm{kpc}$| at all radii, while the latter disc has a much smaller |$(\sim 0.3\, \mathrm{kpc}$|⁠) scale height, except possibly beyond the solar radius R₀, where it probably flares. The thin, or ‘α-normal’ disc is naturally divided into sub-discs comprising stars of similar ages because the α-normal disc has formed continuously over |$8-10\, \mathrm{Gyr}$| and older stellar cohorts now have larger random velocities and vertical scale heights. It is also thought that older cohorts have smaller radial scale lengths.

Recently (Li & Binney 2022), we investigated the structure of our Galaxy’s stellar halo by modelling the distribution of stars identified as RR-Lyrae variables in data from the Pan-STARRS survey (Kaiser et al. 2010). In that paper, we developed a technique for modelling a Galactic component using the 5D data for stars that is available in enormous quantities from the Gaia mission (Gaia Collaboration 2021, and references therein). In this paper, we apply this technique to objects that are likely OB stars, which may be considered tracers of the youngest cohort in the disc. The structure of this component is intrinsically of great interest, but modelling it forces one to engage with problems that do not arise when modelling the RR-Lyrae population.

We model a component by fitting to the data a parametrized distribution function (DF) f(J) that is a function of the action integrals within a given model of the Galaxy’s gravitational potential. This potential itself emerges from fitting DFs for several components, both stellar and dark, to 6D Gaia data (Binney & Vasiliev 2022, hereafter BV2022). The potential is that co-operatively generated by the DFs of dark matter and all the major stellar components together with a gas disc of pre-determined structure. The models are constructed and analysed using the agama software library (Vasiliev 2019).

When we lack data for one of the six phase-space coordinates, as we do for nearly all the ∼1.3 billion stars monitored by Gaia, fitting a DF to data is more computationally demanding because one has marginalize over the unknown line-of-sight velocity v_∥ of each star. Against this significant disadvantage may be set two substantial advantages: (i) in the Early Third Data Release (EDR3) from Gaia (Gaia Collaboration 2021) contains ∼1.3 × 10⁹ stars whereas Gaia’s Radial Velocity Sample RVS (Gaia Collaboration 2018) contains only ∼7 × 10⁶ stars, and (ii) while the selection function (SF) relevant to 5D data is complex, it is much better known than that of the RVS (Boubert et al. 2021; Everall et al. 2021). Hence, when using 5D data significance can be attached to the density of stars in real space in a way that is not possible when using 6D data: in that case only densities in velocity space are significant.

The Galaxy’s young stellar disc is perhaps the hardest component to observe because it is largely confined to a thin layer around the Galactic plane and is therefore heavily obscured, even fairly close to the Sun. We will find that our limited knowledge of the distribution of dust through the disc severely limits our ability to determine the structure of the young disc.

This paper is structured as follows. Section 2 explains how we extracted a sample of OB stars from Gaia EDR3. In Section 3, we outline the approach to modelling 5D data that was explained in Li & Binney (2022); Appendix A gives greater detail. Section 4 specifies the structure of the DF f(J) that we fit to our tracers of the young disc. Section 5.1 outlines the self-consistent Galaxy model and the dust model in which we place models of the young disc. In Section 6, we use mock data to discover how accurately we can determine the structure of the young disc from a sample of OB stars. In Section 7 we fit DFs to our sample of OB stars and conclude that the limited spatial extent of the sample, combined with defects in dust models, severely limit our ability to determine the large-scale structure of the young disc. In Section 7.1, we investigate how well the best-fitting models account for the data. We find that while the models reproduce the observed kinematics well, defects of the dust models prevent the DFs reproducing the distribution of stars on the sky. In Section 7.2, we compare the predictions of our models to 6D data from the LAMOST spectroscopic survey. We confirm the conclusion of Eilers et al. (2020) that stars at |$R\sim 12\, \mathrm{kpc}$| in the anticentre direction are moving systematically outwards as a consequence of large-scale disturbance of the disc. We argue that the orbital velocities of binaries make a significant contribution to the measured line-of-sight velocities of OB stars, which is why several studies have found a puzzling increase with Galactocentric radius in the in-plane velocity dispersion of young stars. Section 8 sums up and identifies useful next steps.

2 A SAMPLE OF OB STARS

We selected OB stars from the intersection of three catalogues: Gaia’s (Gaia Collaboration 2016) early third data release (Gaia Collaboration 2021), the 2MASS infrared catalogue (Skrutskie et al. 2006), and the Starhorse catalogue (Anders et al. 2019, 2021),¹ which provides Bayesian fits of distances and astrophysical parameters to all stars in EDR3 brighter than G = 18.5. EDR3, which was released in December of 2020, contains precision astrometry and photometry for 1.5 billion stars. Parallaxes and proper motions have typical uncertainties of |$0.07\,$|mas and |$0.07\,$|mas|$\, {\rm yr}^{-1}$| at |$G = 17\,$|mag and |$0.5\,$|mas and |$0.5\,$|mas|$\, {\rm yr}^{-1}$| at |$G = 20\,$| mag (Gaia Collaboration 2021). EDR3 gives magnitudes in red and blue passbands G_rp and G_bp in addition to the broad-band G magnitudes.

We start by selecting stars that satisfy three basic criteria (Gaia Collaboration 2018):

$$\begin{eqnarray} &&\varpi /\epsilon _{\varpi }\, \gt \, 5, \\ &&(G_{\rm bp}-G_{\rm rp})_{0}=(G_{\rm bp}-G_{\rm rp})\, -\, E(G_{\rm bp}-G_{\rm rp})\, \lt \, 0, \\ &&M_G=G-5\log (s/10\, \mathrm{pc})-A_{G}\, \lt \, 2, \end{eqnarray}$$

(1)

where A_G and E(G_bp − G_rp) are the star’s extinction and colour from the Starhorse catalogue. In order to avoid contamination by red giants and red clump stars, 2MASS photometry (Skrutskie et al. 2006) is now used to make a further selection. We consider only stars with photometric flag AAA that are blue enough to satisfy

$$\begin{eqnarray} && J-H\lt 0.14(G-K_{s})+0.02, \\ && J-K_{s}\lt 0.23(G-K_{s}). \end{eqnarray}$$

(2)

Then, we exclude A stars by restricting the sample to stars with |$T_{\rm eff}\gt 10\, 000\,$|K in the Starhorse catalogue.

Finally, we exclude stars lying within |$3\, \mathrm{kpc}$| of the Galactic centre to avoid the barred region of the Galaxy. However, on account of the high extinction towards the centre and the density profile of the young disc, few stars are picked even in the range |$R\in (3,5)\, \mathrm{kpc}$|⁠. The final catalogue of OB stars comprises |$46\, 916$| stars. They all have apparent magnitudes brighter than G = 16.65. Fig. 1 shows their spatial distribution.

Figure 1.

The spatial distribution of our sample of OB stars projected on to the xy plane (upper panel) and xz plane (lower panel).

Open in new tab Download slide

2.1 Selecting OB stars from a model

Up to an overall normalization, the DF of the OB stars will differ negligibly from the DF of the young disc. So the sampling algorithm built into agama can be used to pick a sample of OB stars distributed throughout the Galaxy. Absolute magnitudes are assigned to these stars by sampling the luminosity function of OB stars, and using their locations x and a dust model, we compute their apparent magnitudes G. The red curve in Fig. 2 shows the G-band luminosity function that we have used. It was obtained from the PARSEC stellar evolutionary tracks (Bressan et al. 2012),² which yield the number of stars expected in a given range of absolute magnitudes per unit mass of a population with a given star-formation history. We assumed a constant star-formation rate over the last Gyr. Using this procedure we obtained luminosity functions for the G_rp, G_bp, J, H, and K_s bands in addition to the G band.

Figure 2.

The G-band luminosity function for OB stars extracted from the PARSEC stellar evolutionary tracks.

Open in new tab Download slide

We assign each star observational uncertainties based on its apparent magnitude³ and scatter its phase-space coordinates by these errors. Then a mock star enters the catalogue if (cf equation 1)

$$\begin{eqnarray} &&\varpi /\epsilon _{\varpi }\gt 5 \\ && M_G\lt 2 \\ && G\lt 16.65 \\ && R\ge 3\, \mathrm{kpc},\delta \gt -30^{\circ } \end{eqnarray}$$

(3)

These criteria are simple because EDR3 contains essentially all stars brighter than G = 16.65. The restriction on δ arises because we use a dust model (Green 2018; Green et al. 2019), that was developed from photometry taken in Hawii (Kaiser et al. 2010).

3 FORMALISM

Our approach to model fitting is that developed in McMillan & Binney (2012, 2013) and implemented in the case of 5D data by Li & Binney (2022). It is based on an algorithm for determining the likelihood of data given a gravitational potential Φ(x) a DF f(J) and a selection function S(x, M) that gives the probability that a star with absolute magnitude M located at x will enter the catalogue that is being modelled. The likelihoods are used to drive a Markov-chain Monte Carlo (MCMC) exploration of the parameter space of DFs. Two aspects of the present problem require additions to the formulae in Li & Binney (2022): (i) whereas RR-Lyrae stars can be assumed to have a unique absolute magnitude, the absolute magnitudes of our OB stars span a non-negligible range (M⁻, M⁺), and (ii) each star has an extinction A derived from its location x. Appendix A derives the additional formulae.

4 DISTRIBUTION FUNCTION

BV2022 introduced a new family of DFs for disc components.

$$\begin{eqnarray} f({\bf J})=f_\phi (J_\phi)\, f_r(J_\phi ,J_r)\, f_z(J_\phi ,J_z), \end{eqnarray}$$

(4)

where the functions f_r and f_z are

$$\begin{eqnarray} f_i(J_\phi ,J_i)=\bigg (\frac{J_{\phi 0}}{J_{\rm v}}\bigg)^{p_i} \frac{1}{J_{i0}}\, \exp \bigg [-\bigg (\frac{J_{\phi 0}}{J_{\rm v}}\bigg)^{p_{i}}\frac{J_i}{J_{i0}} \bigg ]~~~~(i=r,z). \end{eqnarray}$$

(5)

This definition involves a function

$$\begin{eqnarray} J_{\rm v}\equiv |J_\phi |+J_{\rm v0} \end{eqnarray}$$

(6)

of J_ϕ and six parameters p_r, p_z, J_r0, J_z0, J_ϕ0, J_v0. The parameter p_r determines how rapidly the radial velocity dispersion σ_R declines outwards, while p_z determines how rapidly σ_z declines with distance from the plane. The constants J_r0 and J_z0 set, respectively, the radial and vertical velocity dispersions. Changes to the constant J_v0 modify the central velocity dispersion of the component, which is not of interest for this study.

The form proposed by BV2022 for f_ϕ(J_ϕ) yields a stellar surface density that declines monotonically with radius. Since OB stars have formed recently from molecular gas, and the surface density of gas declines steeply inside the giant molecular ring, i.e. inside |$R\sim 5\, \mathrm{kpc}$|⁠, we multiply the BV2022 form

$$\begin{eqnarray} f_\phi (J_\phi)=\frac{M}{(2\pi)^{3}}\, \frac{J_{\rm d}}{2J_{\phi 0}^2}\, \exp {\bigg (-\frac{J_{\rm d}}{J_{\phi 0}}\bigg)} \end{eqnarray}$$

(7)

of f_ϕ by

$$\begin{eqnarray} 1+\tanh \bigg (\frac{J_{\phi }-|J_{\rm taper}|}{J_{\rm trans}}\bigg), \end{eqnarray}$$

(8)

where J_taper is a parameter that determines the radius at which the surface density peaks and J_trans determines the steepness of the surface density’s decline interior to the peak. In the definition (7) of f_ϕ, M is the disc’s mass, J_ϕ0 sets the disc’s asymptotic scale length, and

$$\begin{eqnarray} J_{\rm d}(J_\phi)=|J_\phi |+J_{\rm d0}, \end{eqnarray}$$

(9)

where J_d0 is a constant. Modifying J_d changes the disc’s central density, which is not of interest here.

We refer readers to BV2022 for more information on the physical significance of the DF’s parameters. Those that are of concern here are p_r, p_z, J_r0, J_z0, J_ϕ0, J_taper, and J_trans. We fix |$J_{\rm v0}=10\, \mathrm{kpc}\, \mathrm{km\, s}^{-1}$| and |$J_{\rm d0}=20\, \mathrm{kpc}\, \mathrm{km\, s}^{-1}$|⁠.

5 POTENTIAL AND DUST MODELS

Given a DF of the form f(J), any observable can be predicted once the Galaxy’s gravitational potential and its distribution of absorbing dust are specified.

5.1 The gravitational potential

The upper panel in Fig. 3 shows the circular speed of the gravitational potential that we have used. The blue dashed line represents all the spheroidal components including dark halo, bulge, and stellar halo. The black dotted line denotes the disc components. This potential was obtained by relaxing to self-consistency an axisymmetric model Galaxy that is defined by distribution functions for dark matter, a bulge, a stellar halo, four superposed stellar discs, and a gas disc. Table 1 lists key characteristics of this model. Further information for the parameters generating this model can be found in the online supplementary material. The lower panel in Fig. 3 shows the vertical density distributions of the components at the solar radius. The dot, dashed, and dash-dotted grey lines represent young, middle-age, and old components of the thin disc, respectively. The solid grey line shows the complete thin disc. The red and magenta lines denote thick disc and stellar halo, respectively. The dark magenta line shows the vertical density of the dark halo.

Figure 3.

The upper panel shows the circular speed v_c of the potential we use. The blue dots are observational estimates obtained by Eilers et al. (2019) from red giant stars. The lower panel shows the vertical density distributions at the solar radius for the components.

Open in new tab Download slide

Table 1.

Open in new tab

The disc mass and densities at solar radius.

M_disc

5.48 × 10¹⁰ M_⊙

ρ(R₀, z = 0)

0.095 M_⊙pc⁻³

ρ(R₀, z = 1)

0.005 M_⊙pc⁻³

5.2 Observational dust model

Several groups have recently developed models of the Galactic distribution of dust (Sale & Magorrian 2014; Lenz, Hensley & Doré 2017; Green 2018; Green et al. 2019). The green curves in Figs 4 and 5 show extinction as a function of distance along eight lines of sight according to the Bayestar2019 model from the most recent of these studies. The left-hand panel of Fig. 4 shows the line of sight (ℓ = 0, b = 10) towards the Galactic centre and that (ℓ = 180, b = 10) towards the anticentre, while the right-hand panel shows the lines of sight vertically downwards and upwards. The upper panels of Fig. 5 are for lines of sight at |$45\,$|deg to the centre and anticentre directions while the bottom left-hand panel is for the direction that in an axisymmetric Galaxy would be equivalent to that of the middle right-hand panel. Actually, the asymptotic extinction is about six times larger at |$\ell =135\,$|deg than at |$\ell =225\,$|deg. In each panel, the cyan line shows the extinction towards extra-Galactic objects from Schlegel et al. (1998). Ideally, each green curve would asymptote at large distances to its cyan line. Unfortunately, the reality falls well short of this ideal with the value from Green et al. (2019) systematically smaller than that from Schlegel et al. (1998).

$Estimates of extinction versus distance (i) in the centre and anticentre directions at $b=10\,$ deg (left-hand panel) and vertically up or down (right-hand panel). The green and orange lines are from the Bayestar2019 (Green et al. 2019) and toy models, respectively. The dotted cyan lines show the asymptotic values from Schlegel, Finkbeiner & Davis (1998).$

Figure 4.

Estimates of extinction versus distance (i) in the centre and anticentre directions at |$b=10\,$| deg (left-hand panel) and vertically up or down (right-hand panel). The green and orange lines are from the Bayestar2019 (Green et al. 2019) and toy models, respectively. The dotted cyan lines show the asymptotic values from Schlegel, Finkbeiner & Davis (1998).

Open in new tab Download slide

$As Fig. 4 for lines of sight at $b=10\,$ deg and various longitudes.$

Figure 5.

As Fig. 4 for lines of sight at |$b=10\,$| deg and various longitudes.

Open in new tab Download slide

5.3 Toy dust model

Since Figs 4 and 5 suggest that even the best current extinction map is likely far from the truth, we have investigated how results obtained from mock data are affected by use of an incorrect dust model. For these tests, we used a toy dust distribution assembled by adding a spiral distribution and a local bubble to an underlying axisymmetric distribution of dust. The latter is

$$\begin{eqnarray} \rho (R)=\rho _0\, \exp {\Big (-\frac{R}{R_s}-\frac{z}{z_s}}\Big), \end{eqnarray}$$

(10)

where |$R_s=3\, \mathrm{kpc}$| and |$z_s=0.1\, \mathrm{kpc}$| are scale length and height of the dust disc, and ρ₀ = 5 is a scale density. We add a four-arm spiral pattern to this axisymmetric density distribution by multiplying it by

$$\begin{eqnarray} s=1+\beta \cos ^{2}{(m h_{s}/2)}, \end{eqnarray}$$

(11)

where β = 6, m = 4 and

$$\begin{eqnarray} h_s=\alpha \ln {R}+(\phi -\phi _0), \end{eqnarray}$$

(12)

with α = 4, ϕ₀ = 135°, and |$R=\sqrt{x^2+y^2}$|⁠.

To simulate the local bubble we further multiply the above density by

$$\begin{eqnarray} b=1-\exp {(-s^2/2\sigma ^2)}, \end{eqnarray}$$

(13)

where s is heliocentric distance and |$\sigma =0.2\, \mathrm{kpc}$| sets the scale of the bubble.

The extinction is then the integral |$A=\int {\rm d}s\, \rho (R,z,\phi)$| of ρ(R, z, ϕ) along the line of sight and we use this extinction to assign apparent magnitudes to mock stars proposed by agama. When analysing the resulting mock catalogue, we use the dust model from Green et al. (2019), which is in this case inaccurate.

The orange curves in Figs 4 and 5 show an example of extinctions from this model alongside the empirical extinctions from Green et al. (2019) and Schlegel et al. (1998). The most serious defect of the toy model is a failure to generate radically different extinctions at the longitudes |$\ell =180\pm 45\,$| deg. This failure is attributable to its spherical local bubble rather than a cavity that is bounded by a very lumpy wall (Zucker et al. 2022). However, even a simplistic toy model of the dust enables us to probe the consequences of fitting data using a defective dust model.

6 TESTS

We generated ten mock catalogues as described in Section 2.1. The parameters used to generate these catalogues are given in the top row of Table 2. Then the computer explored the likelihood in the 7D parameter space with coordinates (J_r0, J_z0, ln J_ϕ0, p_r, p_z, ln J_tapper, ln J_trans). Thus we were holding fixed the parameters J_d0 and J_v0 that only affect a model in the region interior to all the data. In some runs these two parameters took the values used to generate the mock data, while in other runs they were fixed at values that differed from those that generated the data. These tests showed insensitivity of results to the values of these two parameters.

Table 2.

Open in new tab

Summary of test results. Actions are given in |$\, \mathrm{kpc}\, \mathrm{km\, s}^{-1}$|⁠.

Parameters	J_r0	J_z0	ln J_ϕ0	p_r	p_z	ln J_taper	ln J_trans
Input values	20	5	6.91	0.35	0.35	6.91	4.50
Single catalogue	\|$20.21^{+0.65}_{-0.66}$\|	\|$4.97^{+0.09}_{-0.09}$\|	\|$6.89^{+0.02}_{-0.01}$\|	\|$0.27^{+0.05}_{-0.05}$\|	\|$0.30^{+0.02}_{-0.03}$\|	\|$6.90^{+0.01}_{-0.01}$\|	\|$4.53^{+0.10}_{-0.12}$\|
Mean of 10 catalogues	19.07 ± 0.88	5.01 ± 0.10	6.91 ± 0.01	0.31 ± 0.06	0.33 ± 0.03	6.90 ± 0.01	4.42 ± 0.09
Wrong dust model	\|$16.87^{+1.15}_{-1.82}$\|	\|$4.18^{+0.13}_{-0.12}$\|	\|$6.76^{+0.03}_{-0.03}$\|	\|$0.48^{+0.14}_{-0.05}$\|	\|$0.39^{+0.04}_{-0.04}$\|	\|$7.01^{+0.02}_{-0.02}$\|	\|$5.11^{+0.12}_{-0.19}$\|

Parameters	J_r0	J_z0	ln J_ϕ0	p_r	p_z	ln J_taper	ln J_trans
Input values	20	5	6.91	0.35	0.35	6.91	4.50
Single catalogue	\|$20.21^{+0.65}_{-0.66}$\|	\|$4.97^{+0.09}_{-0.09}$\|	\|$6.89^{+0.02}_{-0.01}$\|	\|$0.27^{+0.05}_{-0.05}$\|	\|$0.30^{+0.02}_{-0.03}$\|	\|$6.90^{+0.01}_{-0.01}$\|	\|$4.53^{+0.10}_{-0.12}$\|
Mean of 10 catalogues	19.07 ± 0.88	5.01 ± 0.10	6.91 ± 0.01	0.31 ± 0.06	0.33 ± 0.03	6.90 ± 0.01	4.42 ± 0.09
Wrong dust model	\|$16.87^{+1.15}_{-1.82}$\|	\|$4.18^{+0.13}_{-0.12}$\|	\|$6.76^{+0.03}_{-0.03}$\|	\|$0.48^{+0.14}_{-0.05}$\|	\|$0.39^{+0.04}_{-0.04}$\|	\|$7.01^{+0.02}_{-0.02}$\|	\|$5.11^{+0.12}_{-0.19}$\|

Table 2.

Open in new tab

Summary of test results. Actions are given in |$\, \mathrm{kpc}\, \mathrm{km\, s}^{-1}$|⁠.

Parameters	J_r0	J_z0	ln J_ϕ0	p_r	p_z	ln J_taper	ln J_trans
Input values	20	5	6.91	0.35	0.35	6.91	4.50
Single catalogue	\|$20.21^{+0.65}_{-0.66}$\|	\|$4.97^{+0.09}_{-0.09}$\|	\|$6.89^{+0.02}_{-0.01}$\|	\|$0.27^{+0.05}_{-0.05}$\|	\|$0.30^{+0.02}_{-0.03}$\|	\|$6.90^{+0.01}_{-0.01}$\|	\|$4.53^{+0.10}_{-0.12}$\|
Mean of 10 catalogues	19.07 ± 0.88	5.01 ± 0.10	6.91 ± 0.01	0.31 ± 0.06	0.33 ± 0.03	6.90 ± 0.01	4.42 ± 0.09
Wrong dust model	\|$16.87^{+1.15}_{-1.82}$\|	\|$4.18^{+0.13}_{-0.12}$\|	\|$6.76^{+0.03}_{-0.03}$\|	\|$0.48^{+0.14}_{-0.05}$\|	\|$0.39^{+0.04}_{-0.04}$\|	\|$7.01^{+0.02}_{-0.02}$\|	\|$5.11^{+0.12}_{-0.19}$\|

Parameters	J_r0	J_z0	ln J_ϕ0	p_r	p_z	ln J_taper	ln J_trans
Input values	20	5	6.91	0.35	0.35	6.91	4.50
Single catalogue	\|$20.21^{+0.65}_{-0.66}$\|	\|$4.97^{+0.09}_{-0.09}$\|	\|$6.89^{+0.02}_{-0.01}$\|	\|$0.27^{+0.05}_{-0.05}$\|	\|$0.30^{+0.02}_{-0.03}$\|	\|$6.90^{+0.01}_{-0.01}$\|	\|$4.53^{+0.10}_{-0.12}$\|
Mean of 10 catalogues	19.07 ± 0.88	5.01 ± 0.10	6.91 ± 0.01	0.31 ± 0.06	0.33 ± 0.03	6.90 ± 0.01	4.42 ± 0.09
Wrong dust model	\|$16.87^{+1.15}_{-1.82}$\|	\|$4.18^{+0.13}_{-0.12}$\|	\|$6.76^{+0.03}_{-0.03}$\|	\|$0.48^{+0.14}_{-0.05}$\|	\|$0.39^{+0.04}_{-0.04}$\|	\|$7.01^{+0.02}_{-0.02}$\|	\|$5.11^{+0.12}_{-0.19}$\|

The SLSQP method in the scipy (Virtanen et al. 2020) package was first used to find the maximum of the likelihood and then the emcee package (Foreman-Mackey et al. 2013) was used to run a Markov Chain Monte Carlo exploration starting from the maximum likelihood. We used 35 walkers for each parameter over a total of 500 steps with the first 100 steps treated as burn-in.⁴ Fig. 6 shows a typical set of projections of the posterior probability distribution. The blue lines in each panel indicate the true parameter values while the dashed lines in the histograms show the 1-σ uncertainties from the 16th and 84th percentiles. The second row of Table 2 lists these uncertainties of the posterior distribution yielded by one catalogue, while the table’s third row gives the means and standard deviations of the means of the ten posterior distributions. The standard deviations are close to the 1-σ uncertainties from individual posterior distributions, but in most cases are slightly larger, as would be expected if the distributions had longer tails than a Gaussian.

Figure 6.

Posterior probability distributions from a mock catalogue.

Open in new tab Download slide

Only p_r and p_z have true values that lie outside the 1-σ range inferred from a single catalogue and even these true values lie within the 1-σ ranges inferred from ten catalogues. Thus the MCMC runs are delivering consistent results. Moreover, the precision with which parameters can be recovered is impressive – the uncertainty of the actions other than J_r0 is |$\lesssim 2$| per cent, while that of J_r0 is better than 4 per cent.

6.1 Correlations between parameters

Fig. 6 shows just two significant correlations between parameters, namely between J_r0 and p_r and between J_z0 and p_z. J_r0 sets the magnitude of the radial velocity dispersion σ_R, while p_r sets the radial gradient of σ_R. J_z0 and p_z similarly set the magnitude and radial gradient of σ_z. In the upper panel of Fig. 7 we plot σ_R versus R for three DFs. Also plotted as a chained black line is the radial density of mock stars in the catalogue. We see that very similar values of σ_R are obtained within the well sampled radial range using values of p_r that range from 0 up to 0.6 by compensating for the change in p_r by a change in J_r0. Interior to the well sampled range the three models predict very different values for σ_R, so they could be distinguished if the catalogue contained a significant number of stars at |$R\lt 3\, \mathrm{kpc}$|⁠. The lower panel of Fig. 7 makes the same point in the context of σ_z. Hence the strong degeneracies evident in Fig. 6 reflect the lack of stars at |$R\lesssim 3\, \mathrm{kpc}$|⁠.

Figure 7.

Full lines: the radial and vertical velocity dispersions σ_R and σ_z as functions of radius R in the Galactic plane for several models. Broken lines: the surface density Σ(R) of OB stars – the densities of all models considered are indistinguishable.

Open in new tab Download slide

The uncertainties in J_z0 and p_z are smaller than those in J_r0 and p_r because the greatest differences in R are achieved towards the anticentre, where v_R dominates v_∥, which is not in the catalogue, while v_z dominates v_b, which is in the catalogue.

6.2 Impact of the dust model

Now we explore how using an incorrect dust model affects the posterior distribution of J_ϕ0 and J_taper. Fig. B1 shows the posterior probabilities obtained when a mock catalogue created using the toy dust model of Section 5.3 is analysed using the dust model of Green et al. (2019). The bottom row of Table 2 lists the corresponding statistics.

The formal uncertainties on the parameters, ln J_ϕ0, ln J_taper, and ln J_trans that determine the stellar surface density increase only moderately but the true value of ln J_ϕ0 now lies 0.12 above its 1-σ range, while the true values of ln J_taper and ln J_trans now lie 0.08 and 0.42 below their 1-σ ranges. The lower left-hand panel of Fig. 5 shows that around |$\ell =135\,$|deg, the toy model predicts much lower extinction than the Green et al. (2019) model. Low extinction enhances the number of mock stars, and to account for these stars in the presence of the larger observed extinction, the disc must be imagined more extended than it is. Hence, the recovered value of J_ϕ0 should be larger than the true value. That this is not the case must be attributed to the unrealistically large values of J_taper and J_trans: increasing J_taper pushes outwards the flat part of the surface density, and increasing J_trans extends the flat section.

The formal uncertainties on J_r0,J_z0, p_r, and p_z all increase significantly and the true values of both J_r0 and J_z0 now lie above their 1-σ ranges, while the true values of p_r and p_z still lie within their 1-σ ranges.

This experiment strongly suggests (i) that the formal uncertainties on the recovered parameters of the DF materially underestimate the true uncertainties because the latter are dominated by the uncertain distribution of dust, and (ii) that degeneracies between the parameters make it hard to predict how recovered parameters will be affected by a change in the dust model.

7 FITS TO GAIA DATA

When the code was used for an MCMC search of fits to the real data, the chain showed a long-term drift towards larger J_r0 and smaller J_ϕ0. In light of this drift, the MCMC chain was extended to 1500 steps, three times the number of steps used for the mock data, but without convincing evidence emerging that the chain’s long-term drift had ceased. As discussed below, we think this poor performance of the code on real data is a consequence of our use of an erroneous dust model. On account of the latter, no DF gives a convincing fit to all the data. Different DFs fit different parts of the data and the code wanders from one unsatisfactory DF to another without finding a convincing maximum in the data’s likelihood.

Here we present results from a near-converged section of ∼300 steps. These results illustrate the quality of fit to the data that can be achieved with the present dust model, and provide an indication of the parameter choices that might be found when a satisfactory dust model becomes available. Fig. 8 shows posterior distributions from the chosen section of the chain, while Table 3 gives the means and 1-σ uncertainties of these distributions. It should be borne in mind that the true uncertainties will be significantly larger than those quoted here on account of the premature truncation of the MCMC chain.

Figure 8.

Posterior probability distributions from the real catalogue.

Open in new tab Download slide

Table 3.

Open in new tab

Statistics of the probability distributions from fits to Gaia OB stars. J_d0 and J_v0 are kept fixed at |$20\, \mathrm{km\, s}^{-1}$| and |$10\, \mathrm{km\, s}^{-1}$|⁠, respectively.

Parameters	J_r0	J_z0	ln J_ϕ0	p_r	p_z	ln J_taper	ln J_trans	J_d0	J_v0
Fitted result	\|$47.09^{+0.77}_{-0.87}$\|	\|$5.90^{+0.03}_{-0.03}$\|	\|$6.77^{+0.01}_{-0.01}$\|	\|$-1.48^{+0.02}_{-0.02}$\|	\|$-1.40^{+0.03}_{-0.02}$\|	\|$7.05^{+0.03}_{-0.01}$\|	\|$6.74^{+0.02}_{-0.01}$\|	20	10

Parameters	J_r0	J_z0	ln J_ϕ0	p_r	p_z	ln J_taper	ln J_trans	J_d0	J_v0
Fitted result	\|$47.09^{+0.77}_{-0.87}$\|	\|$5.90^{+0.03}_{-0.03}$\|	\|$6.77^{+0.01}_{-0.01}$\|	\|$-1.48^{+0.02}_{-0.02}$\|	\|$-1.40^{+0.03}_{-0.02}$\|	\|$7.05^{+0.03}_{-0.01}$\|	\|$6.74^{+0.02}_{-0.01}$\|	20	10

Table 3.

Open in new tab

Statistics of the probability distributions from fits to Gaia OB stars. J_d0 and J_v0 are kept fixed at |$20\, \mathrm{km\, s}^{-1}$| and |$10\, \mathrm{km\, s}^{-1}$|⁠, respectively.

Parameters	J_r0	J_z0	ln J_ϕ0	p_r	p_z	ln J_taper	ln J_trans	J_d0	J_v0
Fitted result	\|$47.09^{+0.77}_{-0.87}$\|	\|$5.90^{+0.03}_{-0.03}$\|	\|$6.77^{+0.01}_{-0.01}$\|	\|$-1.48^{+0.02}_{-0.02}$\|	\|$-1.40^{+0.03}_{-0.02}$\|	\|$7.05^{+0.03}_{-0.01}$\|	\|$6.74^{+0.02}_{-0.01}$\|	20	10

Parameters	J_r0	J_z0	ln J_ϕ0	p_r	p_z	ln J_taper	ln J_trans	J_d0	J_v0
Fitted result	\|$47.09^{+0.77}_{-0.87}$\|	\|$5.90^{+0.03}_{-0.03}$\|	\|$6.77^{+0.01}_{-0.01}$\|	\|$-1.48^{+0.02}_{-0.02}$\|	\|$-1.40^{+0.03}_{-0.02}$\|	\|$7.05^{+0.03}_{-0.01}$\|	\|$6.74^{+0.02}_{-0.01}$\|	20	10

In Fig. 8, the histograms are less Gaussian than the corresponding histograms in Fig. 6, and in the off-diagonal panels the regions of high likelihood are typically much more elongated than the corresponding regions. We attribute both phenomena to the failure of the chain to converge on account of the poor dust model.

7.1 Fit quality

A natural first question is whether the models with the largest recovered likelihoods give an acceptable account of the data. Figs 9 and 10 are histograms of velocity in the b and ℓ directions, respectively for stars binned by radius. The red histograms are for real stars and the blue histograms are for a sample of mock stars drawn from the model that gives the real data the highest likelihood. While the red and blue histograms agree moderately well in Fig. 9 for v_b, they are disturbingly different in Fig. 10 for v_ℓ. Figs 11 and 12 shed light on these differences by showing analogous histograms when we use the same model to choose a velocity for each real star. Now the real and mock distributions agree at least as well in v_ℓ as in v_b. Evidently, the clash in Fig. 10 arises because the real and mock stars have systematically different locations; at a given place, the model predicts velocities accurately, but it makes poor choices for the locations of stars.

Figure 9.

Histograms of v_b for real data (red) and for mock stars drawn from the best-fitting model. For each bin we give the median R value of the mock stars.

Open in new tab Download slide

Figure 10.

As Fig. 9 but for v_ℓ.

Open in new tab Download slide

Figure 11.

As Fig. 9 except that the blue histogram is computed from velocities assigned by sampling the best-fitting model at the locations of real rather than mock stars.

Open in new tab Download slide

Figure 12.

As Fig. 11 for the v_ℓ component.

Open in new tab Download slide

Fig. 13 drives this point home by showing the real and mock distributions of stars in the Galactocentric angle ϕ. In general, the real distribution is narrower and shifted to larger ϕ than the mock distribution. Since differential rotation of the disc makes v_ℓ a strong function of ϕ, different distributions in ϕ inevitably leads to a different distributions in v_ℓ.

Figure 13.

The distribution of ϕ for both best-fitting mock and real catalogue stars. The bins are the same as for Figs 10 and 9.

Open in new tab Download slide

The prime suspect for causing these differences in the distribution of ϕ between real and mock stars must be the dust model. Indeed, our axisymmetric DF has very little capacity to modify the distribution in ϕ. Figs 4 and 5 by contrast demonstrate that dust violently breaks the symmetry in ϕ. If the dust model breaks this symmetry incorrectly, the mock stars will not reproduce the observed bias in ϕ, and the predicted values of v_ℓ will be wrongly distributed.

Of course dust also has a strong influence on the radial distribution of catalogued stars. The red and blue histograms in Fig. 14 show the real and mock distributions in R. The mock distribution is broader than the real one because it extends to smaller radii. This finding suggests that the extinctions provided by the dust model tend to be too small in directions towards the Galactic Centre.

Figure 14.

The distribution in R of the real stars (red) and mock stars (blue) drawn from the best-fitting model.

Open in new tab Download slide

Changing the radial distribution of OB stars is very much within the scope of the DF, so it is perhaps puzzling that the best-fitting model does not reproduce the observed distribution in R better. The answer may be that when the DF changes the radial distribution of stars, for example by changing the value of J_taper, it will also change the kinematics. So the need to fit the observed kinematics can push the DF to predict (correctly) an abundance of stars at small radii that do not feature in the data, but do in a mock catalogue drawn using extinctions that are too small.

The curve Fig. 15 shows the surface density of the best-fitting model. The density peaks at |$R\sim 5.5\, \mathrm{kpc}$|⁠, which is now thought to be close to the bar’s corotation radius (Pérez-Villegas et al. 2017; Binney 2020; Chiba, Friske & Schönrich 2021). The density drops faster from the peak inwards than outwards. The blue dots in Fig. 15 show the surface density of the young disc estimated by Xiang et al. (2018) using a catalogue from the LAMOST survey. They found a significant peak at the solar radius which they suggests reflects the Local Arm but they were unable to probe the disc interior to the Sun on account of LAMOST’s bias towards the anticentre. Bovy et al. (2016) used red clump stars in APOGEE to assess the structure of mono-abundance populations (MAPs). The metal-rich, low-α population to which the OB stars belong, was found to have a ‘broken’ exponential radial profile, with the break at |$R=7\, \mathrm{kpc}$|⁠. Mackereth et al. (2017), using red giants in APOGEE as the tracers, found that the youngest population has a break radius at |$R\simeq 8\, \mathrm{kpc}$|⁠. The surface density of our best-fitting model shown in Fig. 15 is broadly in line with these earlier results, although its break radius is somewhat smaller.

Figure 15.

The curve shows the surface density of the best-fitting model of the young disc. The blue data points show the surface densities derived by Xiang et al. (2018) from LAMOST data. The normalization of the model is reduced by a factor 3 to facilitate the comparison of data and model.

Open in new tab Download slide

The predicted peak in the surface density of OB stars at |$R\simeq 5.5\, \mathrm{kpc}$| falls outside the reported peak at |$R=4-5\, \mathrm{kpc}$| in the surface density of H₂ (Heyer & Dame 2015). While studies of the stellar disc are liable to overestimate the radius of the peak stellar density because the innermost young disc is hidden behind the abundant dust near the plane at |$R\lesssim 6\, \mathrm{kpc}$|⁠, the radius at which the density of H₂ peaks may have been underestimated. Indeed, hydrodynamical models (e.g. Sormani, Binney & Magorrian 2015) predict the density of gas to be low inside the bar’s corotation radius, which is currently believed to be as large as |$6\, \mathrm{kpc}$| (Pérez-Villegas et al. 2017; Binney 2020; Chiba et al. 2021). A circular-speed curve is required to convert an observed distribution of emission-line intensity in longitude and velocity into a plot of surface density versus radius. Perhaps when the emission-line data are re-analysed using a circular-speed curve extracted from Gaia, the H₂ distribution will be found to peak at a larger radius.

7.2 Comparison of the LAMOST

Our model predicts distributions of space velocities and in this section we compare these predictions with distributions inferred by combining spectroscopic measurements of v_∥ with EDR3 astrometry. Specifically, many of the OB stars we have identified in the intersection of EDR3, 2MASS, and the Starhorse catalogue were observed by LAMOST (Xiang et al. 2021) so have measured values of v_∥. We select stars within |$\left| b\right| \lt 5\,$| deg and |$\epsilon _v\lt 50\, \mathrm{km\, s}^{-1}$| to exclude stars with poorly determined v_∥. We also shift the LAMOST values of v_∥ by |$4.54\, \mathrm{km\, s}^{-1}$| as suggested by Schönrich & Aumer (2017), Anguiano et al. (2018). After such corrections, LAMOST’s values of v_∥ agree well with the values measured for the same stars by APOGEE and Galah.

Fig. 16 shows the distributions of the corrected values of v_∥ when the stars are divided into four bins in R with the distributions of v_∥ obtained by sampling the best-fitting model at the locations of observed stars. The observed and mock distributions have similar shapes but are offset from one another by an amount that grows with R − R₀. In principle, these offsets could arise through biased distances for the stars causing Galactic rotation to make erroneous contributions to the mock values of v_∥. We found that systematically increasing distances by 10 per cent shifted even the mock histogram for the furthest bin by only |$\sim 1.5\, \mathrm{km\, s}^{-1}$|⁠, so neither erroneous distances nor faulty data reduction can explain the measured offsets.

Figure 16.

Observed and mock histograms of v_∥ in four bins in R.

Open in new tab Download slide

Fig. 17 compares observed and mock distributions of v_R at the locations of LAMOST stars. This comparison is more sensitive to adopted distances than the above comparison of v_∥ values because v_R has contributions from v_ℓ and v_b, which are proportional to distance. These contributions are small, however, because all stars lie near the anticentre. The mock distributions from our axisymmetric model are inevitably symmetric in v_R, so the comparison highlights the asymmetry in the observed distributions. The asymmetry is negligible for the nearest sample because the U component of the solar velocity is chosen to eliminate this asymmetry in the wider populations of disc stars. The observed systematic increase in asymmetry with R − R₀ must arise from some combination of spiral structure and large-scale distortion of the disc by the Sgr Dwarf galaxy, which has been extensively discussed in connection with the Galactic warp and the phase spiral (Jiang & Binney 2000; Antoja et al. 2018; Binney & Schönrich 2018; Bland-Hawthorn et al. 2019; Laporte et al. 2019). Fig. 17 is consistent with what one would expect from the map ov v_R deduced by Eilers et al. (2020) from data for red giant stars.

$Orange histograms: the distributions of vR predicted by the best-fitting model in the plane at $R=8.75,\, 9.25,\, 9.75,$ and $10.25\, \mathrm{kpc}$. Grey histograms: distributions of vR for the OB stars in the LAMOST catalogue that lie in these spatial bins. The number of stars in each bin is 2097, 3572, 4268, and 3215.$

Figure 17.

Orange histograms: the distributions of v_R predicted by the best-fitting model in the plane at |$R=8.75,\, 9.25,\, 9.75,$| and |$10.25\, \mathrm{kpc}$|⁠. Grey histograms: distributions of v_R for the OB stars in the LAMOST catalogue that lie in these spatial bins. The number of stars in each bin is 2097, 3572, 4268, and 3215.

Open in new tab Download slide

Fig. 18 compares our model’s predictions (curves) for the velocity dispersions σ_z and σ_R with results from OB stars in LAMOST. The data for σ_z (red line and points) are in reasonable concordance, although the LAMOST data show a weaker outward decline than the model predicts. Fig. 19 provides evidence that the steeper gradient of the model is required by the EDR3 data by comparing the model prediction (dashed line) with the dispersion in v_b, which for these low-latitude stars differs little from v_z.

Figure 18.

The curves show the profiles of the velocity dispersions σ_z (red) and σ_R (black) predicted by the best-fitting model. The dots show values derived from OB stars in LAMOST.

Open in new tab Download slide

$The curve shows σz as a function of R in the Galactic plane from the best-fitting model. The dots show vb for real stars in Gaia EDR3 with $|b|\lt 5\,$ deg.$

Figure 19.

The curve shows σ_z as a function of R in the Galactic plane from the best-fitting model. The dots show v_b for real stars in Gaia EDR3 with |$|b|\lt 5\,$| deg.

Open in new tab Download slide

In Fig. 18, the black points and full curve for σ_R are starkly incompatible. This conflict signals that the v_R distribution has been broadened as well as shifted to negative values. The points in the upper panel of Fig. 20 show the ratio σ_R/σ_ϕ as a function of R for LAMOST stars, while the black line shows the prediction of the best-fitting model. The data points climb away from the model’s line as R − R₀ grows. A classical result of stellar dynamics is that σ_R/σ_ϕ is largely set by the shape of the circular-speed curve: when the latter is flat, any thin-disc population will have σ_R/σ_ϕ ≃ √2 (e.g. Binney & Tremaine 2008). The LAMOST data imply values of σ_R/σ_ϕ that reach well above 2. No equilibrium model could match these values.

Figure 20.

σ_R/σ_ϕ and σ_z/σ_ϕ plotted against R for OB stars in LAMOST using Starhorse distances.

Open in new tab Download slide

7.3 Impact of binaries

We now ask whether binary stars can explain the anomalously large values of σ_R/σ_ϕ plotted in Fig. 20.

Massive stars usually have a massive binary companion, but we find that only ∼10 per cent of the OB stars in our sample have binary companions resolved by Gaia. Taking the angular resolution of Gaia to be |$0.1\,$|arcsec, it follows that the majority of these stars must be in binaries with separations

$$\begin{eqnarray} r\lt 0.2{s\over \hbox{pc}}\, \hbox{AU}. \end{eqnarray}$$

(14)

Consequently, their orbital velocities satisfy

$$\begin{eqnarray} v_{\rm b}^2\gt 45\bigg ({M_1+M_2\over 10\, {\rm M}_\odot }{\hbox{kpc}\over s}\bigg)\left(\!\, \mathrm{km\, s}^{-1} \right)^2. \end{eqnarray}$$

(15)

The least massive B star has |$M\sim 3.8\, {\rm M}_\odot$| (Binney & Merrifield 1998), so binary velocities in excess of |$10\, \mathrm{km\, s}^{-1}$| must be common.

The velocities of binaries will be isotropically distributed, so the mean-square value of the component along the line of sight is |$v_{\parallel \rm b}^2=v_{\rm b}^2/3$|⁠. The mean-square velocity of the primary is |$M_2^2/(M_1+M_2)^2$| times this, so smaller by a factor of at least 4. If for simplicity we assume that light from the primary dominates the spectrum, the contribution to the measured value of |$v_\parallel ^2$| is

$$\begin{eqnarray} v_{\parallel 1}^2&=&{\textstyle {1\over 3}}{GM_\odot \over 1\, \hbox{AU}} \bigg \langle {1\, \hbox{AU}\over r}\bigg \rangle {M_1+M_2\over \, {\rm M}_\odot }\Big ({M_2\over M_1+M_2}\Big)^2 \\ && \simeq 300\bigg \langle {1\, \hbox{AU}\over r}\bigg \rangle {M_2\over M_1+M_2}{M_2\over M_\odot }(\, \mathrm{km\, s}^{-1})^2 \end{eqnarray}$$

(16)

The distribution of binary separations determined by Duchêne & Kraus (2013) and Gravity Collaboration (2018) in a study of the Orion nebula yields |$\big \langle 1/r\big \rangle \simeq 5\, \hbox{AU}^{-1}$|⁠, so

$$\begin{eqnarray} v_{\parallel 1}^2\simeq (39)^2{M_2\over M_1+M_2}{M_2\over M_\odot }\left(\, \mathrm{km\, s}^{-1} \right)^2 \end{eqnarray}$$

(17)

Given that the product of masses in this equation is of the order of unity and that σ_R is dominated by v_∥, it is very much to be expected that binaries will cause the measured value of σ_R/σ_ϕ to be substantially higher than expected in a model that ignores binaries. Fig. 18 suggests that binaries set a floor value, |$\sigma _R\simeq 20\, \mathrm{km\, s}^{-1}$|⁠, to σ_R, which is smaller by a factor ∼2 than the value suggested by the calculation above. Our value of 〈1/r〉 may well be too large because it is dominated by the small fraction of very close binaries, and is correspondingly uncertain.

8 CONCLUSIONS

We extracted a sample of |$\sim 47\, 000$| OB stars from the intersection of Gaia EDR3 with the 2MASS and Pan-STARRS surveys. For this sample, we have photometry plus 5D astrometry. We used the algorithm developed in Li & Binney (2022) in connection with similar data for a sample of RR-Lyrae stars to fit DFs of the form f(J) to the OB population. Since the surface density of the young disc is expected to peak at a radius of a few kpc, we introduced two new parameters to the disc DFs proposed by BV2022. Small extensions of the algorithm were required to deal with (i) the finite spread in luminosity of the sampled stars, and (ii) extinction by dust. Tests of the algorithm on similar pseudo-data showed that the parameters of the true DF can be recovered with remarkable precision when the distribution of dust is accurately known. When a significantly incorrect dust distribution is used to analyse the data, the true parameters can lie several sigmas above or below the probable range returned by the algorithm.

When the algorithm is run on the real data using the dust model of Green et al. (2019), the most probable DF yields pseudo-data that are not correctly distributed on the sky. The source of this discrepancy is almost certainly imperfections of the dust model used. Since the distribution of velocities depends quite strongly on location, discrepancies in the spatial distribution of stars automatically leads to discrepancies between the predicted and measured velocity distributions. When the most probable DF is used to sample velocities at the observed locations of stars rather than at the locations of pseudo-stars, the observed and predicted velocity distributions agree well. These tests show that a sample of OB stars like the present one would pin down the phase-space structure of the young stellar disc with impressive accuracy if the 3D distribution of dust were accurately known. The discrepancies between the distributions of pseudo-stars and real stars constitute clear evidence that the best current dust model is significantly flawed.

Although the functional form of the DF provides great flexibility in the radial distribution of stars, the best-fitting DF predicts a distribution of OB stars that extends to smaller radii than the observed sample. Since the kinematics of a stellar disc are not independent of its surface-density profile, the MCMC search may favour discs that extend unexpectedly far in because the kinematics of such discs may fit the data better. Moreover, a dust model that included more dust interior to the Sun would make the data consistent with more extended discs. Hence, we consider it likely that there is more dust at |$R\lesssim 7\, \mathrm{kpc}$| than the Bayestar2019 model envisages.

The most probable DF predicts the distributions of the space velocities of OB stars. We have compared these predictions with the distributions of the sub-sample of EDR3 OB stars that have measured line-of-sight velocities because they fell within the LAMOST survey. The predicted and measured dispersions σ_z agree fairly well, although the measured values of σ_z are flat beyond |$R\sim 10\, \mathrm{kpc}$| while the predicted values decrease monotonically outwards. The measured values of σ_R do not decrease outwards as the model predicts – they even increase slightly. Moreover, the median value of v_R drifts upwards from near zero at R₀ to |$\overline{v}_R\sim 10\, \mathrm{km\, s}^{-1}$| at |$R\sim 12\, \mathrm{kpc}$|⁠. We concluded that this trend in |$\overline{v}_R$| is not an artefact induced by erroneous distances. In particular, it is associated with similar offsets between the measured line-of-sight velocities and those predicted by the model. It seems that some combination of spiral structure and disturbance by the gravitational field of the Sgr dwarf galaxy is changing velocities by |$\gtrsim 10\, \mathrm{km\, s}^{-1}$| between here and |$R\sim 12\, \mathrm{kpc}$|⁠.

The ratio σ_R/σ_ϕ, which is strongly constrained by the shape of the circular-speed curve, is unexpectedly large for the LAMOST stars. We argued that this result arises naturally from the majority of the stars being in binaries too tight for Gaia to resolve. The line-of-sight velocity of such a star will differ from its barycentric velocity by a fraction of the binary velocity because the LAMOST data for most stars are based on a single epoch. The EDR3 proper motion, by contrast, is obtained by fitting a curve through the star’s observed positions at ∼40 epochs, so will be barely affected by their binary nature. The line-of-sight velocity of distant LAMOST OB stars feeds strongly into v_R rather than v_ϕ because they are located close towards the anticentre. Hence binaries will push σ_R/σ_ϕ above the value predicted for single stars.

The key to improving our understanding of the Galaxy’s young disc is construction of a better map of the distribution of dust. Traditionally dust is mapped by measuring extinctions for stars with known distances and the advent of vast number of precise parallaxes for stars within a few kiloparsecs of the Sun has breathed new life into this field (Bailer-Jones 2011; Sale & Magorrian 2014; Green et al. 2019; Lallement et al. 2022). The major problems now are (i) the determination of extinctions for large numbers of stars, and (ii) synthesizing the extinction measures into a coherent picture of the dust distribution. This work suggests an approach to dust mapping that dispenses with extinctions measures of individual stars. The stars of a stellar populations that is more than |$\sim 200\, \mathrm{Myr}$| old have to be fairly smoothly distributed in phase space. Moreover, the population’s velocity distribution at one location strongly constrains the population’s DF, and therefore its spatial distribution. Moreover, the velocity distribution at a location can be determined without knowledge of the column of dust through which we see that location because obscuration is velocity-independent. Hence, an exercise along the present lines could yield fairly firm predictions for the spatial distribution of stars, and a dust model could be strongly constrained by comparing this predicted distribution in (ℓ, b, ϖ, μ_ℓ, μ_b, G, colours) with the observed star density.

A great advantage of such an approach is that it could exploit all the |$\gt 1.3\,$| billion stars tracked by Gaia, and thus produce dust maps of unprecedented spatial resolution. To implement this idea, it is necessary both to confront the computational challenge of simultaneously exploring high-dimensional parameter spaces of DFs and dust models, and to obtain a good model of Gaia’s selection function. There is an excellent prospect that implementation will soon be feasible.

SUPPORTING INFORMATION

src.zip

Please note: Oxford University Press is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

ACKNOWLEDGEMENTS

We thank the anonymous referee for the suggestions to improve this paper. CL and JB are supported by the UK Science and Technology Facilities Council under grant number ST/N000919/1. JB also acknowledges support from the Leverhulme Trust through an Emeritus Fellowship.

This work presents results from the European Space Agency (ESA) space mission Gaia. Gaia data are being processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC is provided by national institutions, in particular the institutions participating in the Gaia MultiLateral Agreement (MLA). The Gaia mission website is https://www.cosmos.esa.int/gaia. The Gaia archive website is https://archives.esac.esa.int/gaia.

This publication makes use of data products from the Two Micron All Sky Survey, which is a joint project of the University of Massachusetts and the Infrared Processing and Analysis Center/California Institute of Technology, funded by the National Aeronautics and Space Administration and the National Science Foundation.

DATA AVAILABILITY

The agama source codes and model parameter file are available in the online supplementary material. The data underlying this article will be shared on reasonable request to the corresponding author.

Footnotes

Gaia@AIP Services at https://gaia.aip.de/

http://stev.oapd.inaf.it/cmd

In fact, this is not true especially for the proper motions because of the different celestial frames of Gaia bright and faint stars, respectively (Cantat-Gaudin & Brandt 2021). However, since the Astrometry Spread Function module in Gaia-verse package (Everall et al. 2021) only works for GaiaDR2 now, the simulated astrometric solutions are deviated from those in Gaia EDR3. As a result, we use some simple randomized errors based on apparent magnitudes instead.

The convergence was tested by computing the autocorrelation time recommended by emcee. The average autocorrelation time for all the parameters are about 25 which means the burn-in steps are reasonable. We also test a longer chain with N = 3000 which is more than 100 times the autocorrelation time. It yielded similar results to those obtained with N = 500 and in the interests of economy we use N = 500 here.

https://github.com/nschloe/quadpy

The overall count of 77 × 35 = 2695 evaluations of the DF for each star.

REFERENCES

Anders

et al. ,

2019

A&A

628

A94

Month:	Total Views:
July 2022	14
August 2022	6
September 2022	53
October 2022	88
November 2022	29
December 2022	27
January 2023	29
February 2023	33
March 2023	28
April 2023	17
May 2023	16
June 2023	14
July 2023	19
August 2023	25
September 2023	14
October 2023	11
November 2023	12
December 2023	15
January 2024	14
February 2024	10
March 2024	13
April 2024	11
May 2024	16
June 2024	26
July 2024	28
August 2024	15
September 2024	14
October 2024	16
November 2024	16
December 2024	10
January 2025	11
February 2025	8
March 2025	17
April 2025	10
May 2025	4

Article Contents

Our Galaxy’s youngest disc Open Access

ABSTRACT

1 INTRODUCTION

2 A SAMPLE OF OB STARS

2.1 Selecting OB stars from a model

3 FORMALISM

4 DISTRIBUTION FUNCTION

5 POTENTIAL AND DUST MODELS

5.1 The gravitational potential

5.2 Observational dust model

5.3 Toy dust model

6 TESTS

6.1 Correlations between parameters

6.2 Impact of the dust model

7 FITS TO GAIA DATA

7.1 Fit quality

7.2 Comparison of the LAMOST

7.3 Impact of binaries

8 CONCLUSIONS

SUPPORTING INFORMATION

ACKNOWLEDGEMENTS

DATA AVAILABILITY

Footnotes

REFERENCES

APPENDIX A: FORMULAE FOR COMPUTING LIKELIHOODS

A1 Evaluating the probabilities

APPENDIX B: TEST RESULT FOR WRONG DUST MODEL

Supplementary data

Citations

Views

Altmetric

Email alerts

Astrophysics Data System

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only

Our Galaxy’s youngest disc