Data Products


    Interferometer Data

    Pre-processing pipeline: For each observing beam, a set of uncalibrated correlated visibilities is provided. The data products are stored in Measurement Sets (MS), which can be archived to the LTA. The visibilities, written to the DATA column, have been flagged, demixed (if requested) and averaged in time and frequency according to the user's specification.

    PreFactor pipeline: ASTRON is currently bringing PreFactor, the direction-independent calibration pipeline, into production. PreFactor produces direction-independent calibrated visibilities, wide-band images of the target field, calibration solutions and diagnostic plots. While PreFactor cannot yet be widely offered in Cycle 17, ASTRON will select a sample of suitable projects that will be offered the opportunity to obtain data products processed through this pipeline. PreFactor consists of three pipelines: calibrator, target and imaging. For each target beam, a set of direction-independent calibrated visibilities is provided, stored in Measurement Sets (MS) that can be archived to the LTA. A full-bandwidth image is also produced for each observed target beam and archived in the LTA. In addition, the calibration solutions and diagnostic plots of each PreFactor pipeline will be made available.


    Beam-formed Data

    Raw beam-formed data: These come as two files: a 'raw' file containing the raw data in a binary format, and an HDF5 file ('h5') containing the metadata and a data array linking to the raw data. The most straightforward way to access the data is therefore to use HDF5 tools and open only the 'h5' file. The data can then be found in the sub-folder '/SUB_ARRAY_POINTING_xxx/BEAM_xxx/STOKES_x', with metadata stored as attributes of the root folder and of each sub-folder.
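
    As an illustration of opening only the 'h5' file, the following minimal sketch uses the h5py Python package to list the root attributes and every data array found in such a file. The function name and the choice of h5py are assumptions made for this example and are not part of any LOFAR software. The sketch also assumes that the 'raw' file is kept alongside the 'h5' file, which is likely to be needed as soon as the data values themselves are read.

```python
import h5py


def summarize_beamformed_h5(h5_path):
    """Return (root_attrs, datasets) for an HDF5 file.

    root_attrs is a dict of the attributes stored on the root folder;
    datasets is a list of (path, shape, dtype) tuples, one per data array.
    The function name is hypothetical and used only for this illustration.
    """
    datasets = []

    def visit(name, obj):
        # visititems calls this for every group and dataset below the root;
        # only datasets (the data arrays) are recorded here.
        if isinstance(obj, h5py.Dataset):
            datasets.append(("/" + name, obj.shape, str(obj.dtype)))

    with h5py.File(h5_path, "r") as f:
        root_attrs = dict(f.attrs)  # metadata attached to the root folder
        f.visititems(visit)
    return root_attrs, datasets
```

    Calling this function on an 'h5' file returns paths following the '/SUB_ARRAY_POINTING_xxx/BEAM_xxx/STOKES_x' layout described above, together with the shape and type of each array, without reading the data values.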

    Pulsar data products: These are written in a pulsar-standard PRESTO format, generated by the Known Pulsar Pipeline developed by the LOFAR Pulsar Working Group.  Potential users are advised to contact any member of this group for details of this format.


    Transient Buffer Board Data

    Data are stored as raw voltages per station in HDF5 format, together with some of the metadata. PyCRTools, the software package used to access the data and do some processing (e.g. FFT, RFI mitigation) with Python scripts, is available at CEP.


    Data Releases

    Data releases from different LOFAR surveys and collaborations can be found here. Each data release is accompanied by documentation describing the observations, the data processing and the retrieval of the data.


    The Long-Term Archive (LTA)

    Processed data products and, if requested and granted, raw data products are stored in the LOFAR long-term archive.  The LTA currently involves sites in the Netherlands, Germany, and Poland.  For astronomers, the LOFAR LTA provides the principal interface to LOFAR data retrieval and data mining. In the future, facilities to further process these data are also expected to be available.

    Data can be downloaded through the LOFAR-LTA interface. Initially, data can only be accessed by users associated with the project in question; after one year, the data become public (see the LOFAR data policy for further details). This is the main way in which users access their data following an observation. However, if the data cannot be archived in this way, they may be transferred to the CEP3 user cluster, from where the user has four weeks to download them to their own facilities.

    Since August 2017, after each LOFAR observation, inspection plots are routinely generated, including station dynamic spectra. These plots are used by ASTRON SDCO staff to check the quality of the raw data. They only remain online for 3 weeks from the date of observation. After that, they are compressed and transferred to offline storage. If you wish to access the inspection plots before downloading an observation or pipeline, please contact SDCO staff via the SDC-helpdesk, and provide the project code and SAS ID of the observation you are interested in. Since June 2019 (the beginning of Cycle 12), the data quality report written by SDCO staff based on the inspection plots can also be made available upon request.

    Full details on how to use the LTA and retrieve LOFAR data can be found on the LTA public wiki page. Given the potential quantities of data involved, it is strongly recommended to use Grid facilities and download via SRM if these tools are available.

    Users can also use processing resources offered by the LTA sites to further process their data; see here.

    LTA usage statistics per LOFAR cycle

    Since May 2013 (start of Cycle 0), about 32 PB of LOFAR data have been downloaded from the LTA. The volume of downloaded data has dropped sharply during the past few semesters, as shown in the plot below. This is likely due, in large part, to the use of Dysco compression, which reduces the volume of interferometric data by a factor of about 3.6.
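
    To give a feel for what a factor of about 3.6 means in practice, the short sketch below estimates the compressed volume of a dataset. The function name and the 10 TB example value are hypothetical and chosen only for illustration. The actual reduction achieved by Dysco will vary from one dataset to another.

```python
DYSCO_FACTOR = 3.6  # approximate reduction quoted above for interferometric data


def compressed_volume(uncompressed, factor=DYSCO_FACTOR):
    """Estimate the volume after compression, in the same unit as the input."""
    if factor <= 0:
        raise ValueError("compression factor must be positive")
    return uncompressed / factor


if __name__ == "__main__":
    # Hypothetical 10 TB dataset, used only as an example input
    print(f"{compressed_volume(10.0):.2f} TB")  # prints "2.78 TB"
```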

    Averaging over all semesters, the fraction of data downloaded by non-proprietary users after the data became publicly available is 16%. This fraction has increased significantly during the past few semesters. This is expected as more data have become publicly available, including data from the LOFAR Two-metre Sky Survey (LoTSS), which will eventually cover the whole northern sky.


    Volume of proprietary and non-proprietary data downloaded from the LTA in each LOFAR cycle. The last bin (2021 H1) only includes data until 11 May 2021.


    Since 2017, there has been a strong increase in the volume of downloaded interferometric data. This is around the time when data processing was started on the grid at SURFsara for LoTSS as well as LoTSS co-observing projects. The drop in the volume of downloaded beamformed data during the past few semesters reflects the increase in the proportion of interferometric observations.


    Volume of beamformed and interferometric data downloaded from the LTA. The last bin (2021 H1) only includes data until 11 May 2021.


    One caveat is that data retrieved from the LTA in a non-standard way, e.g. obtained directly from disk without staging, are not tracked in these plots. In addition, some expert users of beamformed data based at ASTRON do not download their data from the LTA but copy them from CEP4. The download volumes of some communities may therefore be underestimated or missing altogether.


    SDC Helpdesk