Galaxies at high redshift are very distant galaxies and, since light propagates through space at a finite speed of approximately 300,000 km s−1, they appear to an observer on the Earth as they were in a very remote past, when the light departed them, carrying information on their properties at that time. Observations of objects with very high redshifts play a central role in cosmology because they provide insight into the epochs and the mechanisms of galaxy formation, if one can reach redshifts that are high enough to correspond to the cosmic epochs when galaxies were forming their first populations of stars and began to shine light throughout space.

One fundamental prediction of the theory of the big bang, which has found empirical confirmation in the discovery of the cosmic background radiation, is that early in its evolution the universe consisted only of matter and radiation coupled in thermodynamical equilibrium and homogeneously distributed in space. No galaxies or stars or any other structure could exist in such physical conditions, except for minuscule primordial fluctuations of density, superimposed on the otherwise extraordinarily smooth distribution of matter and energy by quantum physical processes during the first instant of existence of the universe. Today, the universe is highly inhomogeneous, with the matter organized in a hierarchy of structures such as stars, galaxies, clusters and superclusters of galaxies. Understanding the mechanisms that led to the transition from the homogeneous early universe to the structured one observed at the present epoch is of central importance to cosmology and fundamental physics. Yet, they remain poorly understood and still defeat empirical investigation.

Our current ideas attribute the formation of the cosmic structures to the action of gravity, which amplified the primordial density fluctuations until they collapsed under their own self-gravity and became gravitationally bound structures. According to this paradigm, the epoch of galaxy formation and the properties of the nascent galaxies were decided by the properties and the relative abundances of the matter and the energy that permeated the universe shortly after the big bang. Galaxies are the building blocks of the distribution of matter in space, and an important step in deciphering the history of cosmic evolution is to identify the first galaxies that formed in the universe and study their properties at the time of their formation. The search for ‘primeval galaxies’ has played a central role in observational cosmology for more than three decades and, even after the spectacular accomplishments of the last 5 yr or so, is still driving much of extragalactic research.

One of the most important scientific discoveries of this century is the expansion of the universe. This makes the galaxies and the other larger structures that populate the cosmic space recede away from each other with a velocity that is progressively higher for objects that are separated in space by larger distances. If the recession velocity between two objects is small compared with the speed of light, its value is directly proportional to the distance between them, namely
v_r = H0 d
where v_r is the recession velocity, d is the distance and the constant of proportionality H0 is called the ‘Hubble constant’. For larger recession velocities this relation is replaced by a more general one derived from the theory of general relativity. In either case, the value of H0 gives the recession velocity of a pair of galaxies separated by unit distance, and hence sets the rate of the expansion. A universe that has been expanding at a higher rate has taken a shorter time to grow from its size at the big bang (close to a mathematical point) to its current size, so a higher value of the Hubble constant also means a younger universe. Recent measurements of the Hubble constant place its value in the range between 60 and 70 km s−1 Mpc−1 (1 Mpc = 3.1 × 10^22 m). This corresponds to an age of the universe between 10 and 15 Gyr (1 Gyr = 10^9 yr).
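As a rough numerical illustration of the relations above, the following Python sketch evaluates the low-velocity Hubble law and the characteristic expansion time 1/H0. The function names and conversion constants are choices made for this example only. The true age of the universe differs from 1/H0 by a factor of order unity that depends on the cosmological model, which is why the range quoted above is wider than the one this sketch gives.

```python
# Illustrative sketch: Hubble law and the characteristic expansion time 1/H0.
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in 10^9 Julian years


def recession_velocity(h0, distance_mpc):
    """Recession velocity in km/s for H0 in km/s/Mpc (valid only for v << c)."""
    return h0 * distance_mpc


def hubble_time_gyr(h0):
    """Return 1/H0 in Gyr for H0 given in km/s/Mpc."""
    return KM_PER_MPC / h0 / SECONDS_PER_GYR


for h0 in (60.0, 70.0):
    v = recession_velocity(h0, 100.0)  # a galaxy 100 Mpc away
    print(f"H0 = {h0:.0f} km/s/Mpc: v = {v:.0f} km/s, 1/H0 = {hubble_time_gyr(h0):.1f} Gyr")
```

A larger H0 gives a shorter 1/H0, which is the sense in which a higher Hubble constant implies a younger universe.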

A result of the cosmic expansion and of the finite speed of light is that the light emitted from any given galaxy is observed by a remote observer (i.e. one located on a different galaxy, for example the Milky Way) as having a longer wavelength than it had at the emission. This is because the cosmic expansion has caused the space between the two galaxies to increase during the lapse of time between emission and detection, and while the light was traveling from one galaxy to the other. This causes a ‘stretching’ of the wavelength of the light, namely a shift towards longer (therefore redder) wavelengths. The fractional change in the wavelength with respect to the one at emission as measured by the observer is called the ‘redshift’. It follows that, the higher the redshift, the longer the stretch in wavelength, and hence the more distant the two galaxies.

The redshift provides a measure of the distance to a galaxy as well as of the lapse of time between the instant of emission and that of observation. Thus, an observer can see the galaxy and study its properties as it was at the epoch of the emission. Galaxies located near ours have very small redshifts and are seen as they were a very short time ago compared with the age of the universe. In practice, local galaxies are contemporaries of our own, and hence the present cosmic time corresponds to redshift zero. The value of the redshift of objects seen at the time of the big bang is infinity. Figure 1 shows the relationship between redshift, distance and the corresponding age of the universe (i.e. the lapse of time from the big bang to that redshift) for two different assumptions for the cosmological parameters Ω and ΩΛ, namely the total density of matter and the energy density due to the cosmological constant, respectively. In both cases the Hubble constant has been assumed to be 65 km s−1 Mpc−1.
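The age-redshift relation for the two models can be sketched numerically. The Python example below uses the standard closed-form expressions for the age of a spatially flat universe, one containing matter only and one containing matter plus a cosmological constant. The function and parameter names (`cosmic_age_gyr`, `omega_m`, `omega_lambda`) are assumptions of this sketch, and it handles only the two flat cases discussed in the text.

```python
import math

KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16  # seconds in 10^9 Julian years


def cosmic_age_gyr(z, omega_m, omega_lambda, h0=65.0):
    """Age of a flat universe at redshift z, in Gyr (H0 in km/s/Mpc).

    Only flat models are handled: matter only (omega_lambda == 0)
    or matter plus a cosmological constant.
    """
    if abs(omega_m + omega_lambda - 1.0) > 1e-9:
        raise ValueError("this sketch only handles flat models")
    hubble_time = KM_PER_MPC / h0 / SECONDS_PER_GYR
    if omega_lambda == 0.0:
        # Matter-only flat universe: t = (2/3) / H0 * (1+z)^(-3/2)
        return (2.0 / 3.0) * hubble_time * (1.0 + z) ** -1.5
    # Flat universe with matter and a cosmological constant
    x = math.sqrt(omega_lambda / omega_m) * (1.0 + z) ** -1.5
    return (2.0 / 3.0) * hubble_time * math.asinh(x) / math.sqrt(omega_lambda)


for z in (0.0, 1.0, 3.0, 5.5, 10.0):
    matter_only = cosmic_age_gyr(z, 1.0, 0.0)
    with_lambda = cosmic_age_gyr(z, 0.3, 0.7)
    print(f"z = {z:4.1f}: {matter_only:5.2f} Gyr (matter only), {with_lambda:5.2f} Gyr (with constant)")
```

At redshift zero the two models give ages close to 10 and 14.5 Gyr, and by z = 10 the age is already a small fraction of the present one in both.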

Figure 1. Redshift, distance and cosmic age for two cosmological models. The Hubble constant is assumed to be H0 = 65 km s−1 Mpc−1 in both cases. In one model only matter is present in space, namely Ω = 1.0 (continuous curves); in the other model the space is filled with both matter and the cosmological constant, with Ω = 0.3 and ΩΛ = 0.7 (dashed curves). The thin curves represent the cosmic age at the various redshifts. The cosmic age at the present epoch (redshift zero) is 10 and 14.5 billion years in the two models, respectively. At the time of the big bang (infinite redshift) the cosmic age was zero. Within the accuracy of the plot, redshift z = 10 can already be identified as infinite. The thick curves represent the distance of a source as a function of its redshift, expressed in units of 1000 Mpc. As a comparison, the Andromeda galaxy is at about 0.8 Mpc from the Milky Way and the Virgo cluster at about 18.5 Mpc. The most distant galaxies observed so far are located at a redshift z ∼ 5.5.

In practice, redshifts are measured from the spectra of the galaxies, namely diagrams that plot the intensity of a galaxy’s light (also referred to as flux) as a function of its wavelength, by comparing the observed wavelengths of atomic and molecular features (either emission or absorption lines) with those of the same features observed in laboratories on Earth (i.e. at redshift zero). The top panel of figure 2 shows the model ultraviolet (UV) spectrum of a star-forming galaxy at redshift z = 3. Absorption lines can be observed as deep ‘dents’ in the spectrum. Another pronounced feature visible in the figure is the sudden dimming of the light at wavelengths shorter than the ‘Lyman limit’, as we shall discuss later.

Because of their extreme distances, galaxies at high redshift necessarily appear to an observer as faint objects, even if their absolute luminosity is large. However, only a very small fraction of all the faint galaxies observed in a deep image of the sky are located at high redshifts. This is because galaxies have absolute luminosities that cover a wide range, extending over several orders of magnitude, and fainter galaxies are also much more abundant in space than brighter ones. As a result, deep images of the sky are crowded by a myriad of relatively close but intrinsically faint galaxies, while intrinsically luminous galaxies at high redshift are comparatively very rare. This ‘contamination’ by interlopers is in fact so severe that, without some criterion to cull the distant galaxies from the faint, nearby ones, it would be totally impractical to search for high-redshift galaxies by randomly measuring the redshifts of samples of faint galaxies until the very distant ones are found. Because of this elusiveness, high-redshift galaxies came to be regarded as the ‘holy grail’ of cosmological research.

Theoretical expectations of the mechanism of formation of the so-called spheroids, namely elliptical galaxies and bulges of spiral galaxies, provided a criterion to identify high-redshift galaxies. The stars that constitute the spheroids are among the oldest observed in the local universe, and currently there is no appreciable ongoing activity of star formation in these systems. These facts indicate that the spheroids formed their stars during a very early cosmic epoch in a relatively short period of time. Since this epoch marked the first significant episodes of star formation during the cosmic evolution (note that about 1/2 of all the stars of the present-day universe are found within spheroids), it has been traditionally identified with the epoch of galaxy formation, and the nascent spheroids themselves have been identified with the primeval galaxies.

The theoretical expectations predicted that the entire stellar content of a spheroid formed during the gravitational collapse of the proto-cloud of gas from which the structure originated. Simple physical arguments show that the duration of such a collapse is of the order of the time of free fall which, for a galaxy with the mass of the Milky Way, is about one hundred million years. If a whole galaxy’s worth of stars is to be formed in such a relatively short time, the rate of formation of the stars must have reached very high values, a condition known as a starburst. For example, a galaxy like the Milky Way has a stellar mass of about 10^11 M⊙, and if this assembled during a burst that lasted ∼10^8 yr, the star formation rate must have been ∼10^3 M⊙ yr−1. As a comparison, the star formation rate of the Milky Way today is about 1 M⊙ yr−1. A galaxy undergoing its first starburst is expected to emit intense UV radiation including, in particular, a very strong emission line at the wavelength λα = 1216 Å called Lyman-α. This feature is produced by the recombination of hydrogen atoms, a gas which is very abundant in young, star-forming galaxies, photoionized by the intense UV light of young, massive stars.
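The starburst estimate above is a simple ratio of mass to time. The minimal Python sketch below reproduces it, with a function name chosen only for this illustration.

```python
def burst_star_formation_rate(stellar_mass_msun, burst_duration_yr):
    """Mean star formation rate, in solar masses per year, needed to build
    the given stellar mass within the given burst duration."""
    if burst_duration_yr <= 0.0:
        raise ValueError("burst duration must be positive")
    return stellar_mass_msun / burst_duration_yr


# Values quoted in the text: a Milky Way-like stellar mass formed in one free-fall time.
rate = burst_star_formation_rate(1e11, 1e8)
print(f"mean burst rate: {rate:.0f} solar masses per year")
```

The result is about three orders of magnitude above the present-day rate of the Milky Way quoted in the text.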

Estimates of the age of the stellar populations of the spheroids place the epoch of the bursts in the redshift range 2 < z < 7. The UV spectrum of such galaxies would then be observed redshifted to optical and near-infrared wavelengths and be detectable by ground-based telescopes and electronic detectors. For example, the Lyman-α line of a nascent spheroid at redshift z = 3.5 is observed today at λobs = (1 + z)λα ∼ 5500 Å, which is in the middle of the visible band. At redshift z ∼ 7 or larger the Lyman-α line is redshifted into the infrared portion of the spectrum.
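The relation between rest and observed wavelength can be checked with a short Python sketch. It applies λobs = (1 + z)λα to the Lyman-α line and inverts the same relation to recover a redshift from an observed line. The function names are assumptions of this example.

```python
LYMAN_ALPHA_ANGSTROM = 1216.0  # rest wavelength of Lyman-alpha quoted in the text


def observed_wavelength(rest_wavelength, z):
    """Wavelength at which a feature emitted at rest_wavelength is observed."""
    return (1.0 + z) * rest_wavelength


def redshift_from_line(observed, rest_wavelength):
    """Redshift implied by an identified line observed at the given wavelength."""
    return observed / rest_wavelength - 1.0


for z in (2.0, 3.5, 7.0):
    lam = observed_wavelength(LYMAN_ALPHA_ANGSTROM, z)
    print(f"z = {z}: Lyman-alpha observed at {lam:.0f} angstroms")
```

Across the range 2 < z < 7 the line moves from the near-ultraviolet edge of the optical window to just beyond 9700 Å, which is why infrared detectors are needed at the highest redshifts.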

Large observational campaigns to identify primeval galaxies by means of their Lyman-α emission line started in the early 1980s, when solid-state electronic imaging detectors replaced the traditional photographic plates. The electronic detectors, also known as charge-coupled devices (CCDs), overcame the characteristic limited sensitivity and difficulty of calibration of the plates, which had so far limited the study of very faint objects. In the early 1990s, electronic detectors sensitive to infrared radiation also became available, which made it possible to extend the searches for Lyman-α emitters to higher redshifts than those probed by optical detectors. These observations were designed to be sensitive to sources characterized by the presence of a strong emission line. Some consisted of narrow-band imaging, namely of images taken through narrow filters that can isolate the line emission, if its redshift is such as to place it at the wavelength of the filter. The presence of the emission line is detected from the galaxy’s relatively brighter apparent luminosity in the narrow-band filter compared with that through a conventional filter. Other searches consisted of spectroscopy of regions of the sky and were designed to directly detect the emission line from the recorded spectra.

Interestingly, except for a handful of objects, Lyman-α emitters at high redshifts were not found, while numerous detections of relatively bright objects were expected if a whole population of galaxies (i.e. the spheroids) formed through the mechanism of gravitational collapse. The lack of detections provided support for competing models for the formation of the spheroids. These predict that spheroids were assembled through hierarchical merging of smaller subgalactic structures. These small systems would themselves have formed through a gravitational collapse before they merged into larger galaxies. However, because they would be significantly smaller than the present-day bright spheroids, they are predicted to have been correspondingly fainter than massive spheroids at the time of their formation, in this way eluding detection.

Unfortunately, the lack of detections of bright Lyman-α emitters at high redshifts did not conclusively constrain the mechanism of formation of the spheroids. On the one hand it did not prove the merging scenario. On the other hand it did not disprove the monolithic collapse scenario, since other explanations can be put forward to explain the lack of Lyman-α emitters. The most important of these is the presence of dust, which can form in nascent galaxies in a relatively short time, of the order of the free-fall time, as a result of the high rate of supernova events that take place in these systems. Even relatively small amounts of dust can have large effects on the Lyman-α luminosity, because this emission line is resonant. This means that, unlike the photons of non-resonant lines, which can travel long distances within the interstellar gas before interacting with other atoms, the Lyman-α photons are constantly being absorbed and re-emitted by virtually all the hydrogen atoms that they encounter in their path. This greatly increases the geometrical path of the photons within the gas cloud before they can leave it. This, in turn, greatly increases the chances of the photons being absorbed and destroyed by dust grains present in the cloud, resulting in a selective dimming of the luminosity of the galaxies at the Lyman-α wavelength.

Very recently, thanks to the extraordinary sensitivity of telescopes of large aperture, such as the 10 m Keck, Lyman-α emitting galaxies at high redshift have finally been identified (Hu et al 1998). Interestingly, when sufficient numbers of these galaxies were observed that their abundances in space as a function of the luminosity (the so-called ‘Luminosity Function of Galaxies’) could be characterized with sufficient precision, it became clear that those luminous enough to be considered plausible protospheroids were too rare. By then, however, a massive population of star-forming galaxies at high redshift (with or without Lyman-α emission) had already been identified by means of another technique. While the properties of these galaxies provided an explanation for the paucity of Lyman-α emitters, they also provided plausible candidates for the progenitors of the present-day spheroids.

A completely different technique to search for star-forming galaxies at high redshift was proposed in the early 1990s. This technique exploits another major feature of the UV spectra of galaxies with ongoing star formation, namely the hydrogen ionization edge or Lyman limit.

Star-forming galaxies are very luminous at UV wavelengths and have a characteristic ‘blue’ spectrum, i.e. rich in radiation of short wavelengths, which is ‘flat’, namely the intensity of the light does not depend on the wavelength. In a diagram that plots the light intensity as a function of wavelength such a spectrum looks approximately horizontal, as the top panel of figure 2 shows. However, ionizing radiation, namely light with wavelength shorter than 912 Å, although copiously produced inside the galaxies, cannot escape them (and thus be observable), because it is entirely absorbed by the hydrogen gas, which is very abundant within and around star-forming galaxies. When a hydrogen atom is hit by a photon with wavelength shorter than 912 Å, it becomes ionized, that is its electron is stripped from the proton at the expense of the energy of the photon itself, which is destroyed.

As a result of this absorption of ionizing photons, the spectrum as recorded by an observer external to the galaxies has a very pronounced ‘dimming’ of more than an order of magnitude at wavelengths shorter than the Lyman limit. This is called the Lyman break (see figure 2, top panel). Additional absorption is also introduced by the intergalactic gas that the light from the galaxy travels through in its journey towards the observer. This increases the amplitude of the break even further (and also slightly attenuates the spectrum shortly before the break).

The Lyman break abruptly interrupts the ‘flatness’ of the spectra of star-forming galaxies, and provides the telltale clue to identify them. For example, at redshifts around z ∼ 3 the Lyman break is shifted from 912 Å in the rest frame to ∼3700 Å in the observer’s frame, well into the optical band, observable from the ground. When observed through a set of filters that straddle the Lyman break, star-forming galaxies at those redshifts appear relatively bright in those filters that probe the spectrum longward of the break (the I and V filters of figure 2), dim a little just before the break (the B filter), and are extremely faint (or not visible at all) in those filters that probe shortward of the break (the U filter). Thus, by accurately measuring the light of galaxies in a carefully selected set of filters, it is possible to cull high-redshift galaxy candidates from the much more abundant galaxies of similar apparent luminosity placed at modest distances. This is illustrated in the middle and bottom panels of figure 2, which show the transmittance curves of filters used for this technique and one of the high-redshift galaxy candidates found in this way, respectively. These candidates are then followed up with spectroscopic measurements to confirm their redshifts. In practice, the selection of candidates is efficiently done with telescopes of middle size (e.g. 4–5 m), while the spectroscopic measurements to confirm the redshifts require telescopes of the 8 m class or larger. The candidate shown in figure 2 has been confirmed to be at redshift z = 2.8.
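The logic of the filter-based selection can be sketched in a few lines of Python. This is only a schematic illustration of the idea described above, not the selection criteria of any actual survey, which are defined by carefully calibrated colour cuts. The function name, the thresholds `min_break` and `max_flat_ratio`, and the example fluxes are all hypothetical.

```python
def is_lyman_break_candidate(flux_short, flux_mid, flux_long,
                             min_break=10.0, max_flat_ratio=2.0):
    """Flag a source whose spectrum is roughly flat longward of a break
    and strongly suppressed shortward of it.

    flux_short: flux in the filter shortward of the break (e.g. U)
    flux_mid, flux_long: fluxes in two filters longward of it (e.g. V and I)
    Fluxes share the same arbitrary linear units; a non-detection may be 0.0.
    The default thresholds are hypothetical values chosen for illustration.
    """
    if flux_mid <= 0.0 or flux_long <= 0.0:
        return False  # the source must be detected longward of the break
    ratio = flux_mid / flux_long
    is_flat = 1.0 / max_flat_ratio <= ratio <= max_flat_ratio
    # Require a drop of at least min_break (an order of magnitude by default).
    has_break = flux_short * min_break <= min(flux_mid, flux_long)
    return is_flat and has_break


# Hypothetical sources (arbitrary flux units), for illustration only.
sources = {
    "flat spectrum with a strong break": (0.02, 1.0, 1.1),
    "flat spectrum with no break": (0.8, 1.0, 1.1),
    "red spectrum, not flat": (0.02, 1.0, 5.0),
}
for label, fluxes in sources.items():
    print(label, "->", is_lyman_break_candidate(*fluxes))
```

Only the first source satisfies both conditions. Objects flagged in this way would still be candidates only, to be confirmed spectroscopically as described above.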

The wavelength of the filter bandpass is what determines the redshift range of the candidates. The filter suite shown in the figure is very sensitive in the redshift range 2 ≲ z ≲ 3.5. However, by excluding the U band and using the B band in its place to probe shortward of the Lyman limit, together with the V and I bands to probe longward of it, one can target higher redshift intervals, which in this case is 3.5 ≲ z ≲ 4.5. Other redder filters can be used for even higher redshift intervals.

The ‘Lyman-break technique’ turned out to be very sensitive and efficient. Although the exact numbers depend on the targeted redshift range, the fraction of the candidates that are confirmed at high redshifts by the spectroscopic measurements is very high. For example, galaxies in the range 2 ≲ z ≲ 3.5 are identified with essentially 100% efficiency. In the range 3.5 ≲ z ≲ 4.5 the efficiency is still ∼50%. Two major surveys have been made using the Lyman-break technique. The largest one has been carried out by Steidel and collaborators (Steidel et al 1996, 1999) using ground-based telescopes for both the selection of the candidates and the spectroscopic confirmation. To date, this survey includes about 2000 candidates at z ∼ 3, of which ∼1000 have spectroscopically measured redshifts, and another ∼300 candidates at z ∼ 4 with ∼60 spectroscopic redshifts.

Another important survey has been made from space using the data collected by the Hubble Space Telescope during the observations of the HDF (HUBBLE DEEP FIELD) survey (Madau et al 1996). Although significantly smaller in size than the ground-based survey owing to the limited coverage of sky area, the HDF has allowed the identification of significantly fainter Lyman-break galaxies than possible from the ground, allowing researchers to study how the properties of these systems change with their luminosity. Because the archive is freely accessible to the world community, several groups have used the HDF data to identify high-redshift galaxies, and there have been reports of possible detection of galaxies with redshifts as high as z ∼ 5.5 in the HDF.

The samples of high-redshift galaxies made available by the Lyman-break technique have finally allowed researchers to carry out empirical studies of their properties, opening the distant universe to direct investigation. These studies include both statistical analysis of the large samples themselves and follow-up observations, including high-angular-resolution imaging with the Hubble Space Telescope (HST) and imaging and spectroscopy at near-infrared wavelengths (to study the rest-frame optical ones) from large ground-based facilities and again from HST. This work has allowed us to test our fundamental ideas on galaxy and structure formation on an empirical basis.

One important thing must be kept in mind when interpreting the galaxies’ properties. The UV luminosity of star-forming galaxies (i.e. the intensity of the UV spectrum) is powered by the emission of young, massive stars, those approximately 20 times more massive than our Sun, while less massive stars are responsible for the optical and infrared luminosity. Massive stars are produced during star formation together with less massive ones, but, in contrast to the latter, which can live up to several billion years (depending on their mass), they only live about a million years or so, and then they explode as SUPERNOVAE. If the star formation continues in a galaxy at a steady rate, new massive stars constantly replace the dead ones, and the UV luminosity of the galaxy remains constant. The higher the star formation rate, namely the higher the number of massive stars formed in a given time, the larger the UV luminosity of the galaxy. At the same time low-mass stars keep piling up, increasing the optical luminosity of the galaxies. If star formation ceases, massive stars die off within a few 10^6 yr, and the UV luminosity fades away. The galaxy, however, remains visible at optical and infrared wavelengths because of the long-lived less-massive stars. The longer the duration of the star formation phase, the larger the amount of low-mass stars formed and the higher the optical and infrared luminosity. Thus, the UV luminosity of star-forming galaxies is a direct measure of their star formation rate, while the optical luminosity is linked to the amount of low-mass stars that have been formed.

The assembly of galaxy structures

High-resolution images at optical wavelengths obtained with HST have revealed a variety of morphologies and sizes (Giavalisco et al 1996). Since at these high redshifts optical images probe the UV light, they provide information on the regions with active star formation. The images show that some galaxies are compact systems, with relatively smooth and regular morphology that bears a pronounced resemblance to the spheroids observed in the present-day universe. One example of such galaxies is shown in figure 3. Other galaxies are also regular, but they have more diffuse light profiles (i.e. the variation of the brightness from the center to the outer regions) and look more similar to present-day spirals, although familiar features of these systems such as the spiral arms have not been identified at high redshift. Still other galaxies are irregular and fragmented, often showing multiple compact nuclei embedded in diffuse nebulosity, and overall morphology that cannot be classified in terms of the traditional galaxy types.

This diversity points to a variety of formation mechanisms. The compact, spheroidal galaxies can be explained if they formed as a result of a violent dynamical process, either the gravitational collapse of a proto-cloud or the merging of a number of discrete subunits. The more diffuse ones can be explained by the constant accretion of gas onto a rotationally supported disk, where it is converted into stars. In either case, the interesting fact is that the light profile of these galaxies strongly suggests that they are dynamically evolved and stable systems. At the observed rates of star formation, and if left undisturbed in the course of their evolution, they will have evolved into what are today elliptical and spiral galaxies, respectively, of medium mass and luminosity. The fact that intense star formation seems to be occurring in these systems after the main dynamical event that gave rise to their structure (and most likely triggered the star formation activity) suggests that continued hierarchical merging after redshift z ∼ 3 is not necessary for the formation of some galaxies. One prediction in this case is that the mass of the regular Lyman-break galaxies is similar to that of present-day elliptical and spiral galaxies.

On the other hand, the irregular and fragmented morphology of other Lyman-break galaxies suggests that intense star formation occurs during interactions and merging events. In this case the expectation is that the mass of the forming galaxies progressively increases during the evolution (i.e. at smaller redshifts) and hence Lyman-break galaxies at z ∼ 3 should also include small, submassive systems.

Direct measurements of the mass of distant galaxies are very difficult at present, even with the most powerful of the current telescopes, whereas they will be routine with large-aperture space telescopes such as the proposed NASA 8 m Next Generation Space Telescope. Great progress in understanding the mechanisms of galaxy formation is expected with such a facility.

The cosmic evolution of star formation

One of the most important results that came out of the discovery of galaxies at high redshift is the possibility to reconstruct the evolution of the activity of star formation in the universe over a very large stretch of cosmic time. The temporal interval currently probed extends from the present epoch to when the universe was about 10% of its current age.

The amount of stars being produced in the universe at any given epoch can be estimated from the abundance and UV luminosity of star-forming galaxies at that epoch. The abundance is measured by their volume density, namely the number of galaxies in a given volume of space, for example 1 Mpc^3 (2.9 × 10^67 m^3), while the luminosity is directly obtained from the photometry of the sources detected in the images. The counts of Lyman-break galaxies down to the faintest available UV luminosity, therefore, can be used to derive a measure of the UV luminosity density. Recalling that the UV luminosity is proportional to the star formation rate, this can then be expressed as a star formation density, namely the amount of stellar mass formed per unit time in a given volume of the universe at a given redshift. A practical unit of measure for the cosmic star formation density is solar masses per year per cubic megaparsec, or M⊙ yr−1 Mpc−3.
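The chain of steps described above (sum the UV luminosities of the galaxies counted in a known volume, then convert luminosity to star formation rate) can be sketched as follows. The conversion factor is not given in this article: the default used here, 1.4 × 10^-28 solar masses per year per erg s−1 Hz−1, is a commonly used calibration and is an assumption of this sketch. The sample luminosities and the survey volume are hypothetical, and a real measurement would also account for galaxies too faint to be detected.

```python
# ASSUMPTION: commonly used UV calibration, in Msun/yr per (erg/s/Hz).
# It is not taken from this article and depends on the stellar mass function.
SFR_PER_UV_LUMINOSITY = 1.4e-28


def star_formation_density(uv_luminosities, volume_mpc3,
                           sfr_per_luminosity=SFR_PER_UV_LUMINOSITY):
    """Star formation density in Msun/yr/Mpc^3.

    uv_luminosities: UV luminosities (erg/s/Hz) of the galaxies counted
    volume_mpc3: volume of space surveyed, in cubic megaparsecs
    """
    if volume_mpc3 <= 0.0:
        raise ValueError("survey volume must be positive")
    luminosity_density = sum(uv_luminosities) / volume_mpc3
    return sfr_per_luminosity * luminosity_density


# Hypothetical sample: three galaxies found in a 1000 Mpc^3 volume.
sample = [1.0e29, 5.0e28, 2.0e28]
density = star_formation_density(sample, 1000.0)
print(f"star formation density: {density:.4f} Msun/yr/Mpc^3")
```

Because the star formation rate is taken to be proportional to the UV luminosity, any error in the measured luminosities propagates linearly into the derived density.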

The diagram in figure 4 (top) shows the cosmic star formation density plotted (on a logarithmic scale) as a function of redshift. The different symbols indicate different surveys that have measured the cosmic star formation density in a variety of redshift intervals, marked by the horizontal bars. Also shown are the uncertainties on the measurements, marked by the vertical bars.

Taken at face value, the diagram suggests that cosmic star formation activity started sometime just before redshift z ∼ 5, gradually increased towards smaller redshifts, reaching a peak at redshifts z ∼ 1 (about 1/3 of the cosmic age after the big bang) and then rapidly decreased towards the present epoch (i.e. z = 0) to the same levels as at z ∼ 5. If confirmed, this would be a very interesting result, because it would empirically support the idea that during its early evolution the universe experienced a time when it contained no stars (dark age).

Figure 4. The cosmic star formation density, namely the stellar mass formed per year per cubic megaparsec, as a function of redshift. The y-axis is plotted on a logarithmic scale, and the mass is measured in units of the solar mass, i.e. 2 × 10^33 g. The vertical bars for each data point represent the uncertainties on the measurements. The horizontal bars represent the redshift intervals covered by each data point. The top panel shows the data as observed; those in the bottom panel have been corrected for dust absorption, assuming that the properties of the dust do not change with redshift and are the same as observed in the local universe. Notice that the uncorrected data seem to suggest that the cosmic star formation began sometime prior to redshift z ∼ 5, progressively increased until it reached a peak at z ∼ 1 and then sharply decreased towards the present epoch, i.e. z = 0. The corrected data, however, do not support this interpretation, suggesting instead that the epoch of the onset of star formation in the universe has not yet been observed by the current data, and must be searched for at redshifts higher than z ∼ 5.

However, these measurements must be taken with great caution, since they can be subject to systematic errors that can bias the results. As should be clear, the measures of abundance are of a statistical nature, since they are based on galaxy counts, and therefore they rely on the assumption that the observed samples at high redshift are fair representations of the population of star-forming galaxies at their epochs. This would not be the case, however, if the samples were too small, because in this case they could be subject to statistical fluctuations (cosmic variance) and be either overpopulated or underpopulated in galaxies, depending on whether an overdense or underdense region, respectively, has been (by statistical chance) targeted.

Another more serious and insidious source of systematic error is that measures of UV luminosity can be affected by unknown amounts of dust obscuration. The interstellar medium of nascent galaxies becomes polluted by dust on short time scales, as this is produced in the supernova explosions that end the short lives of massive stars. Dust affects the observed luminosity of these sources, because it absorbs UV and optical radiation and converts it into infrared radiation. Moreover, the absorption is more severe for radiation with shorter wavelengths than with longer ones. As a result, the observed UV spectra are fainter and redder than they would be if dust were not present, and an observer would conclude from them that the galaxies are less luminous and hence forming stars at lower rates than they actually are.

Qualitatively, the presence of dust in the Lyman-break galaxies at z∼ 3 and ∼4 has been revealed by a number of indicators. These include the observed UV spectra, which are systematically somewhat redder (richer in radiation with longer wavelengths) than those of known dust-free galaxies with similar properties. Another indicator is the intensity of the rest-frame optical emission lines (observed at near-IR wavelengths, because of the redshift) which are comparatively too strong for the observed UV luminosity, consistently indicating the presence of dust reddening. Quantitatively, however, it is extremely difficult to estimate the amount of obscuration suffered by the galaxies and, therefore, attempt a correction to recover the intrinsic luminosity and star formation rates. This is because the correction depends on the properties of the dust (the so-called ‘extinction curve’) and on the shape of the ‘unreddened’ spectra (i.e. those that would be observed without dust obscuration). Since these are not precisely known, the correction can only be computed after making some assumptions, and thus it will depend on them.

The bottom panel of figure 4 shows the evolution of the cosmic star formation activity after corrections for dust obscuration have been made, assuming that the properties of the dust and the intrinsic spectra of the galaxies at high redshift are similar to those of starburst galaxies in the local universe. If these assumptions are realistic, the correction at z ∼ 3 is about a factor of 5, implying that the amount of star formation at those cosmic epochs was actually 5 times higher than what one would naively derive using the observed data without any dust correction. Changing the assumptions to cover all the known cases of dust properties and spectra of star-forming galaxies would change the correction, making it as small as a factor of 2 or as large as a factor of 15.
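The arithmetic behind these factors is simple and can be sketched as follows. The three factors are those quoted above; the observed rate is set to 1 in arbitrary units, and the labels and function names are illustrative. The conversion to magnitudes uses the standard relation A = 2.5 log10(factor).

```python
import math

# Correction factors quoted in the text; the labels are illustrative.
DUST_CORRECTION_FACTORS = {
    "smallest": 2.0,
    "local-starburst assumption": 5.0,
    "largest": 15.0,
}

def corrected_rate(observed_rate: float, factor: float) -> float:
    """Star formation rate after multiplying by the dust correction factor."""
    return observed_rate * factor

def factor_to_magnitudes(factor: float) -> float:
    """UV attenuation in magnitudes equivalent to a multiplicative factor."""
    return 2.5 * math.log10(factor)

observed = 1.0  # arbitrary units, uncorrected for dust
for label, factor in DUST_CORRECTION_FACTORS.items():
    rate = corrected_rate(observed, factor)
    mags = factor_to_magnitudes(factor)
    print(f"{label:>27s}: x{factor:4.1f} -> rate {rate:5.1f}, {mags:.2f} mag")
```

The spread between the smallest and the largest factor is itself a factor of 7.5, which gives a sense of how strongly the corrected star formation history depends on the adopted assumptions.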

Irrespective of the exact value of the correction, however, the important fact is that there is no obvious evidence for a decline of the cosmic star formation activity towards high redshifts (i.e. past z ∼ 1) when some amount of dust correction is attempted. This means that, going toward the highest redshifts probed by the current galaxy surveys, namely z ∼ 4.5, or ∼10% of the cosmic age, there is no evidence that the cosmic star formation activity is decreasing from the level reached at z ∼ 1. In other words, there is no evidence from the current data that, going from large redshifts toward small ones, we are seeing the end of the dark era and the beginning of the epoch of star formation, namely the transition from the epoch when the universe did not contain stars in appreciable quantities to when it started to form them efficiently. It now seems very likely that to detect and study this transition we will have to identify galaxies at even higher redshifts than we are currently capable of doing.

Are there more high-redshift galaxies?

One interesting question to ask is whether the Lyman-break technique, i.e. the selection of distant galaxies from their UV emission, returns all the galaxies that are physically present at the targeted redshifts. Star-forming galaxies with a large amount of dust, or old galaxies, namely those with previously formed stellar populations and no star formation activity, have very little UV luminosity, if any at all, and cannot be found with the Lyman-break technique. Discovering such galaxies at very high redshifts (say, z>3), if they exist, would have enormous consequences. On the one hand, if UV-dark star-forming galaxies are present in large numbers, this means that we have severely underestimated the amount of stars formed early in the universe. This implies that today there are many more stars and heavy elements2 than we currently observe in the universe. On the other hand, if old galaxies are already present at very high redshifts, this would imply either that we have underestimated the age of the universe or that we do not understand very well the physical conditions of the early universe or the time scales of stellar evolution.

Recent imaging observations at submillimetric wavelengths (Lilly et al 1999) have unveiled a population of sources with properties consistent with star-forming galaxies at high redshift whose UV radiation is being either partially or completely absorbed by dust and re-emitted at far-infrared wavelengths (observed at submillimetric wavelengths because of the redshift). Interestingly, some of these sources have been identified as Lyman-break galaxies at z ∼ 3 (Barger et al 1999) and, conversely, Lyman-break galaxies at z ∼ 3 have been observed as submillimetric sources (Chapman et al 2000). Many others have eluded the efforts to identify them and assign them a redshift, and at present the redshift distribution of the submillimetric population is unknown. Similarly unknown, of course, is the extent to which the Lyman-break galaxies and the submillimetric galaxies overlap, i.e. are the same objects. Assuming that the latter are placed at z>2 and adding together all their luminosities yields a total energy output at infrared wavelengths that is a factor of several larger than the energy output at UV wavelengths by the Lyman-break galaxies. We have seen, however, that the UV luminosity of these sources is likely to be underestimated by a factor of several owing to dust obscuration, and, when a correction is included, the two contributions become comparable. This is consistent with the possibility that the two galaxy populations are one and the same. In such a case the submillimetric emission would be powered by the UV light absorbed by dust and re-emitted at infrared wavelengths3.

Unfortunately, with the current instrumentation it is not possible to identify the nature of the submillimetric sources with great confidence, and to establish whether or not they have UV emission, which from high redshifts would be observed at optical wavelengths. This is largely because the angular resolution of the images is very coarse (about 15 seconds of arc) and many faint optical sources can be found within a circular region of the sky with such a large diameter, making the identification with optical galaxies very difficult. Future generations of submillimeter telescopes, such as the ALMA project that will be commissioned in Chile in 2007 as a joint collaborative project of the US, Europe and Japan, will have the sensitivity and resolving power to accurately image the sources and allow the identification with optical ones (or demonstrate that optical counterparts are very rare). It will be very interesting to see whether current UV surveys have accounted for all the star formation activity in the young universe, whether most of it has actually been hidden by dust, or else whether the universe was already populated by ‘old’ objects when it was only ∼10% or less of its current age.
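The identification problem caused by the coarse resolution can be made concrete with a small sketch. It computes how many unrelated optical sources are expected, on average, inside a circle 15 seconds of arc across, and the Poisson probability that at least one falls inside it by chance. The surface densities used are hypothetical placeholders, not measured values, and the function names are illustrative.

```python
import math

def expected_chance_sources(surface_density_per_arcmin2: float,
                            beam_diameter_arcsec: float = 15.0) -> float:
    """Mean number of unrelated sources inside a circular beam.

    surface_density_per_arcmin2 is the sky density of faint optical
    sources (a hypothetical input here); the beam is taken as a circle.
    """
    radius_arcmin = (beam_diameter_arcsec / 2.0) / 60.0
    return surface_density_per_arcmin2 * math.pi * radius_arcmin ** 2

def prob_at_least_one(expected: float) -> float:
    """Poisson probability of one or more chance sources in the beam."""
    return 1.0 - math.exp(-expected)

# Hypothetical surface densities, in sources per square arcminute.
for density in (10.0, 100.0):
    mean = expected_chance_sources(density)
    prob = prob_at_least_one(mean)
    print(f"{density:6.1f} per arcmin^2: {mean:.2f} expected, P(>=1) = {prob:.2f}")
```

Since the expected number scales with the square of the beam diameter, an instrument with a beam ten times smaller would see the number of chance alignments fall by a factor of 100.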

Galaxy formation and dark matter

Another fundamental avenue of research opened by the identification of forming galaxies at high redshift is the possibility of testing the idea that gravity has been the force responsible for galaxy and structure formation.

If gravity has assembled the cosmic structures, then one prediction is that galaxies have formed in those regions of space where enough mass had condensed and produced the gravitational pull to confine the gas in a relatively small volume of space, promoting and facilitating its conversion into stars.

What are these mass condensations that seeded galaxy and star formation? There is compelling evidence that the majority of the mass present in the universe (about 90% of it) is not in the form of visible matter but is dark, either because it is cold or because it interacts very weakly with the electromagnetic radiation. Its presence, however, is detected through the gravitational effects, such as motion, that it induces in other, visible objects. The nature of the dark matter is unknown. It has been proposed that it might consist of ordinary matter such as objects similar to planets or unborn stars (brown dwarfs), or of more exotic types of matter, such as subatomic particles (e.g. massive neutrinos), other species of particles predicted by quantum theories (e.g. axions, photinos, gravitinos) or even of such systems as primordial mini black holes. It is beyond the scope of this article to discuss dark matter in detail, and in the following we shall refer to it as a ‘cosmological fluid’ that, together with the regular visible matter, fills the space in the universe, and consider only its gravitational effects.

If gravity has assembled structures out of the cosmic fluid, then one specific prediction of the theory is that the spatial distribution of these structures (we shall refer to these structures as ‘the halos’) is not homogeneous, but highly clustered. In other words, the halos are not expected to be distributed in space at random, but preferentially located next to each other, defining local concentrations that are also preferentially found next to other concentrations, and so on. Thus, if the visible part of galaxies (i.e. the stars) forms when gas condenses inside the dark matter halos, then the spatial distribution of forming galaxies must be the same as that expected for the halos themselves.

One of the most remarkable properties of the Lyman-break galaxies is that their spatial distribution is, within the uncertainties of the measurements, the same as that expected for the dark matter halos (Giavalisco et al 1998; Adelberger et al 1998). Figure 5 illustrates the strong spatial clustering of the galaxies. It shows the probability of finding a pair of galaxies separated by an angle θ in the sky as a function of the angle itself (measured in seconds of arc). This quantity is known as the angular correlation function and it is traditionally represented by the symbol ω(θ). The data show that pairs of Lyman-break galaxies separated by small angles in the sky are much more likely to be found than pairs with large separations. This means that the galaxies have a strong tendency to cluster in space, namely to be physically closer to each other than in a homogeneous (random) distribution. If the galaxies were randomly distributed in space, the function ω(θ) would have been flat, namely equal to a constant numerical value for any angular separation, indicating that pairs could have been found equally likely at all angular separations.
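How such a function is measured can be sketched with mock data. The example below uses the simple 'natural' estimator, which compares the number of galaxy pairs in each bin of separation with the number found in a random, unclustered catalogue covering the same field. With this normalization the estimate measures the excess of pairs over a random distribution, so an unclustered sample gives a flat value close to zero. The mock catalogues, the field size, the bin edges and the choice of estimator are all assumptions made for illustration; they are not the data or the method of the cited studies.

```python
import numpy as np

def pair_counts(points, bin_edges):
    """Histogram of pair separations for an (N, 2) array of sky positions.

    Positions are in seconds of arc on a small, flat patch of sky.
    """
    diff = points[:, None, :] - points[None, :, :]
    sep = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)  # count each pair once
    counts, _ = np.histogram(sep[upper], bins=bin_edges)
    return counts.astype(float)

def angular_correlation(data, randoms, bin_edges):
    """Natural estimator: normalized DD / RR - 1 in each separation bin."""
    dd = pair_counts(data, bin_edges)
    rr = pair_counts(randoms, bin_edges)
    n_d, n_r = len(data), len(randoms)
    norm = (n_r * (n_r - 1)) / (n_d * (n_d - 1))
    with np.errstate(divide="ignore", invalid="ignore"):
        return norm * dd / rr - 1.0

def mock_clustered_field(rng, n_groups=40, per_group=10,
                         field=500.0, scatter=10.0):
    """Hypothetical clustered sample: galaxies scattered around group centres."""
    centres = rng.uniform(0.0, field, size=(n_groups, 2))
    offsets = rng.normal(0.0, scatter, size=(n_groups, per_group, 2))
    # Wrap positions so that every galaxy stays inside the field.
    return ((centres[:, None, :] + offsets) % field).reshape(-1, 2)

rng = np.random.default_rng(42)
field = 500.0  # side of the square field, in seconds of arc
bin_edges = np.array([0.0, 20.0, 50.0, 100.0, 200.0])

randoms = rng.uniform(0.0, field, size=(1000, 2))
clustered = mock_clustered_field(rng, field=field)
uniform = rng.uniform(0.0, field, size=(400, 2))

w_clustered = angular_correlation(clustered, randoms, bin_edges)
w_uniform = angular_correlation(uniform, randoms, bin_edges)

for lo, hi, wc, wu in zip(bin_edges[:-1], bin_edges[1:], w_clustered, w_uniform):
    print(f"{lo:5.0f}-{hi:<5.0f} arcsec  clustered {wc:+.2f}  uniform {wu:+.2f}")
```

In the clustered mock the estimate is large at small separations and falls toward zero at large ones, while the uniform mock stays near zero in every bin. Published analyses generally use more refined estimators and account for the geometry of the survey fields, which this sketch ignores.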

Because the distances to the galaxies are known, it is possible to transform the angular separations in the sky into physical separations in space and derive the spatial scales over which Lyman-break galaxies are clustered. These scales turn out to be of the order of 3–5 Mpc, the exact values depending on the Hubble constant H0 and the cosmological parameters Ω and ΩΛ, which are not yet precisely known. These observed scales have been found to be the same as those predicted by the theory of gravitational instability. The theory also predicts the abundances of the galaxies (i.e. the number of galaxies in a given volume of space), and, remarkably, the observed numbers of galaxies also agree very well with the predicted ones.
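The conversion from angle to distance, and its dependence on the cosmological parameters, can be sketched for the simplest case. The example assumes an Einstein-de Sitter universe (matter density at the critical value and no cosmological constant), for which the comoving distance has the closed form (2c/H0)[1 - (1 + z)^(-1/2)], and it adopts a Hubble constant of 70 km/s/Mpc. Both choices are assumptions made for illustration; other cosmological models require a numerical integration and give different distances.

```python
import math

C_KM_S = 299_792.458                         # speed of light in km/s
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def comoving_distance_eds_mpc(z: float, h0: float = 70.0) -> float:
    """Comoving distance in Mpc for an Einstein-de Sitter universe.

    Assumes critical matter density and no cosmological constant;
    h0 is the Hubble constant in km/s/Mpc (illustrative default).
    """
    return (2.0 * C_KM_S / h0) * (1.0 - 1.0 / math.sqrt(1.0 + z))

def transverse_separation_mpc(theta_arcsec: float, z: float,
                              h0: float = 70.0, comoving: bool = True) -> float:
    """Separation on the sky converted to Mpc (small-angle approximation).

    Returns the comoving separation, or the proper (physical) separation
    at redshift z when comoving is False.
    """
    theta_rad = theta_arcsec / ARCSEC_PER_RADIAN
    separation = comoving_distance_eds_mpc(z, h0) * theta_rad
    return separation if comoving else separation / (1.0 + z)

for theta in (10.0, 100.0, 200.0):
    sep = transverse_separation_mpc(theta, z=3.0)
    print(f"{theta:6.1f} arcsec at z = 3 -> {sep:.2f} Mpc (comoving)")
```

Because the Hubble constant enters only as an overall factor 1/H0, halving its value doubles every derived separation, which is one reason the clustering scales cannot be quoted more precisely until the cosmological parameters are pinned down.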

The predictions of the theory of gravitational instability concern structures that form out of the ‘cosmic fluid’ through the action of gravity. The simultaneous agreement of the observed clustering strength and spatial abundances of the Lyman-break galaxies with the analogous quantities predicted for the dark matter halos shows that the visible component of galaxies has the same properties as those predicted for the halos. This is consistent with the idea that visible galaxies and dark halos are physically associated and in fact are the same structures. This evidence provides strong empirical support for the notion that star formation takes place within the gravitational potential provided by the halos (recall that the dark matter accounts for ∼90% of the mass), which would act as ‘condensation seeds’, making it possible for the gas to condense and transform into stars. The observed clustering properties and abundances of Lyman-break galaxies represent a remarkable success for the theory and show that the main ideas behind the paradigm of galaxy formation are generally robust.