LOFAR Data Archive

    Long Term Archive User Manual

    This is a short manual on how to search for and retrieve data from the LOFAR Long Term Archive (LTA).

    To access the LTA, go to: lta.lofar.eu

    In case of problems, please refer to the Frequently Asked Questions section below.

    All use of LOFAR data, whether public or otherwise, must adhere to the LOFAR data policy. Note that any publications resulting from use of LOFAR data must include one of the standard acknowledgement phrases given at the end of this policy document.

     

    User Access

    Lost password

    In the case of a lost password, please visit https://webportal.astron.nl/pwm/public/ForgottenPassword to request a new one.

    Searching & retrieving data

    The LTA catalogue can be searched directly without an account. Because metadata are public for all LTA content, all projects can be browsed and search queries return results from the entire catalogue.

    Staging and subsequent downloading of public data always requires an account with LTA user access privileges. Users who are members of a submitted LOFAR proposal can use the associated account for accessing the LTA. If you do not have an account yet, you can register through “Create account”. LTA access privileges will need to be granted by the Science Data Centre Operations (SDCO). Once you have created your account, please submit a support request to the ASTRON SDC Helpdesk asking for LTA privileges and clearly state your username.

    To stage and retrieve project-related data in the LTA which are proprietary, you need to have a LOFAR account that is enabled for the archive and coupled with the projects of interest. In case your account should be coupled with a project and currently is not, you can request the SDCO to add you to the list of co-authors of the project. When you send such a request, you must add the project's PI in cc. After SDCO adds you to the project, you might get an email asking you to set a new password in ASTRON Web Applications Password Self Service. Please note that this will set a new password not just for the LTA but for the other LOFAR services as well.

    Please read the LOFAR Data Policy for more information about proprietary vs public data.

    Data navigation in the archive

    Page navigation

    The LTA menu, as shown below, gives access to the main functionalities.

    Users can log in to the LTA through the LOGIN button at the top right.

    To start a search in the LTA, click on the SEARCH DATA button on the menu. This will, by default, show the basic search page, where users can select the data product type of interest and perform a cone search. An advanced search mode, with more parameters per data type, can also be selected by clicking on the drop-down menu on the left side.

    A “project” can be selected by clicking on the BROWSE PROJECTS button on the menu. This is particularly useful for project related searches, as well as for checking either public projects or user's co-author membership. At this stage, several actions are allowed. A quick overview of how to search data is below, followed by how to stage and download the selected data.

    Quick guide to search and retrieve data

    Basic search

    The Basic Search module allows searching for data within a specified pointing (coordinates) and specifying whether to perform a search on observations and/or pipelines.
    Several reference systems are available. In particular: to search for Solar datasets the Sun reference system should be selected in combination with the type of process of interest (e.g. Observation and/or Averaging Pipeline) before pressing the Search button. Note that when a project has been preselected, the search will be confined to only that project.

    • Log in to https://lta.lofar.eu/.
    • Click SEARCH DATA in the top menu.
    • Specify the data product types of interest and a target name or coordinates.
    • Click on the “Search” button at the bottom.
    • From the screen that follows, you should be able to stage the data products by selecting the desired data through the check marks and clicking 'Stage selected'.

     

    Advanced search

    The Advanced Search modules allow you to specify coordinates and specific parameters of the observation or pipeline products that you are looking for.
    To discover Solar datasets, users should select the Sun reference system, together with a specification of the parameters of interest before pressing the Search button.
    A search on observations will return the setup of the telescope at the time of observing, but may not return any downloadable data. Typically, only pipeline products are archived and these can be directly searched for by selecting the Pipeline modules. If observations are selected for the query, it will be possible to find and select related pipelines on the results page.

    • Log in to https://lta.lofar.eu/.
    • Click SEARCH DATA in the top menu.
    • Click on the side panel Advanced Search drop-down list.
    • Specify the data product types of interest from the drop-down list.
    • Select product features and specify a target name, coordinates or SAS ID (unique identifier for a given observation/pipeline run).
    • Click on the “Search” button at the bottom.
    • From the screen that follows, you should be able to stage the data products by selecting the desired data through the check marks and clicking 'Stage selected' and 'Submit'.

    This figure shows one of the different drop-down options for Advanced Search, which will return any observation for a known SAS ID.

     

    Project search (to restrict all data searches to that project only)

    Under the Project Search menu is a table showing projects that can be selected to restrict all subsequent data searches to that project only. Use the “Search” button to select the project and go to the search page, or use the "Show data" button to select the project and show all data in it. Alternatively, click on the project name to view the project details. The first column shows whether you are a member of the project or whether the project is public.

    • Log in to https://lta.lofar.eu/.
    • Click BROWSE PROJECTS in the top menu.
    • Select which cycle the project is part of.
    • Actions in this menu include:
      • Clicking on the project name to view the project details and, if desired, select it.
      • Using the "Search" button to select the project and go to the search page. Subsequent searches will then be done within that project.
      • Using the "Show data" button to select the project and to show all data in it.
    • From the screen(s) that follow(s), you should be able to search, select and stage the data products.

     

    Special search tricks

    There are some useful ways to find your data in the LTA that make selecting and navigating your data easier. This section elaborates on a few more advanced options for browsing the LTA.

    Queries

    While using the Advanced Search tab, you can select a range of SAS IDs by using a colon. For example, if you want all observations and pipelines that have a SAS ID in the range 432000 to 432190, you can enter 432000:432190 in the search box. Similarly, you can enter a comma-separated query for a (non-successive) list of SAS IDs. Note that you can query only up to 100 items at a time.
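
    The two query forms described above can be generated with a small helper when a long list of SAS IDs has to be searched. This is only a sketch: the function names are hypothetical, and it assumes that the limit of 100 items counts the individual SAS IDs in a comma-separated query.

```python
def sas_id_range(first, last):
    """Return a 'first:last' range query for the Advanced Search box."""
    if last < first:
        raise ValueError("last SAS ID must not be smaller than the first")
    return f"{first}:{last}"


def sas_id_queries(sas_ids, max_items=100):
    """Split SAS IDs into comma-separated queries of at most max_items IDs.

    Assumption: the 100-item limit applies to the number of IDs per query.
    """
    ids = sorted({int(s) for s in sas_ids})  # de-duplicate and sort
    return [
        ",".join(str(i) for i in ids[k:k + max_items])
        for k in range(0, len(ids), max_items)
    ]


if __name__ == "__main__":
    print(sas_id_range(432000, 432190))
    for query in sas_id_queries([432190, 432000, 432005]):
        print(query)
```

    Each returned string can be pasted into the search box as one query.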

    Edit columns

    Once you have submitted a search query, you land on a page showing a table-like overview of the submitted object/SAS IDs. This table has several headers that change depending on the type of data that you searched for. For a more advanced way to inspect the returned query, you can change the columns displayed through the "edit columns" button. This will make a new window pop up with a range of options that you can select or deselect. In order to apply the changes, you must click the submit button at the very bottom of the checklist.

     

    Viewing data

    When you are looking at the results of your query, there are several different menus and levels you can inspect. For each searched SAS ID or object, you can dive into the blue-coloured entries. One of the most useful and important entries is the "data products" tab. Once you click on 'data products', it shows all the subbands or pointings that were part of this SAS ID, and each can be individually selected for download. Similarly, if you searched through the Basic Search tab for an observation that only has pipeline products on the LTA, you can navigate from the Observation information of that SAS ID to the downloadable pipeline products.

     

    DBView

    For advanced use cases, a service is available at https://lta-dbview.lofar.eu/ that gives the option to run your own queries on the database or build them using a tables view. General help on usage is accessible from the service page.

    A potentially useful query, which returns all files for a certain SAS ID, is shown below.

    SELECT fo.URI, dp."dataProductType", dp."dataProductIdentifier",
     dp."processIdentifier"
    FROM AWOPER."DataProduct+" dp,
         AWOPER.FileObject fo,
         AWOPER."Process+" pr
    WHERE dp."processIdentifier" = pr."processIdentifier"
      AND pr."observationId" = '123456'
      AND fo.data_object = dp."object_id"
      AND dp."isValid"> 0
    

    In this query, '123456' should be replaced with the SAS ID of the observation or pipeline you are looking for. Although this may be confusing, pipeline SAS IDs are stored under "observationId", just as for observations. To run this query, go to the link above, log in as the right user, select the right project, and then put the query into the “Manual SQL”.

    You can also modify these queries. Available tables and fields can be inspected by following the "Tables" link at the top of the DBView page. For example, if you also want to know the MD5 checksum, you can run:

    SELECT fo.URI, fo.hash_md5, dp."dataProductType", dp."dataProductIdentifier",
     dp."processIdentifier"
    FROM AWOPER."DataProduct+" dp,
         AWOPER.FileObject fo,
         AWOPER."Process+" pr
    WHERE dp."processIdentifier" = pr."processIdentifier"
      AND pr."observationId" = '123456'
      AND fo.data_object = dp."object_id"
      AND dp."isValid"> 0
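
    Once a file has been downloaded, the checksum returned by the query above can be compared with a locally computed one. The sketch below is one way to do this; the function names are hypothetical, and it assumes that hash_md5 holds the usual 32-character hexadecimal MD5 digest.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives never sit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_catalogue(path, expected_md5):
    """Compare a local file against a checksum copied from DBView.

    Assumption: expected_md5 is a hexadecimal digest string.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

    A mismatch usually means the download is incomplete or corrupted and should be repeated.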
    

     

    Identifying Data for Download

    Depending on the search parameters, e.g., which data products were requested (observation, pipeline), lists of observations and/or pipelines will be returned (see observations example above). Note that not all observations and pipelines will have archived data that can be retrieved. Please verify availability of data first. If the data is available and the user has access/rights to download this data, then there are several options to select data for staging:

    1. select observations/pipelines and stage (prepare for download) all data related to the selection.
    2. select observations and “show pipelines” related to the observations, then select pipelines and stage.
    3. select observations/pipelines, “show data products” related to the selection, possibly filter the data products (to have smaller selections) and then select and stage the data products.

    Note that observations often have no raw data in the archive. Their metadata is still visible because subsequent pipelines have processed the raw data further, and the resulting products were then ingested into the LTA. To get to the pipelines related to observations, use “Show Pipelines”.

    To see whether observations or pipelines have data products in the LTA, look for the “Number of Correlated/BeamFormed Data Products” column. These columns, as well as a few others, can also be used to navigate to the relevant data products.

    Once you have a list of data products on your screen, the “Release Date” will tell you when the data are available for public download. If the data is public, or you are a member of the project, the “checkbox” column will be selectable and staging can proceed. You can also hover with your mouse over the checkbox and get more information, like the size, location and checksums.

    Unspecified Data/Process

    Some data could have had problems somewhere in the automation and control part of the LOFAR software during observation or processing. Sometimes a few subbands might be affected, sometimes an entire observation. Science Data Centre Operations will check the data, (re)run things manually or fix things if needed and then archive the data. This does mean that the automation and control sometimes loses track of the files and the archiving process has no information beyond the Observation ID and filename itself. In such cases, a few subbands or an entire observation might end up under “Unspecified Process”. We do attempt to fix things at a later date, but that is not always feasible. If the files were archived, the data itself is usable. What is missing is the information that the LTA needs to properly label and query the data.

    If an Observation is missing, or is missing subbands, please check if it ended up under Unspecified.

     

    Staging Data (prepare for download)

    Once you have a list of data products, observations or pipelines, you can use the check boxes to select which files you want to download. The first check box can be used to select or deselect all files/observations on a page.

    The LOFAR Archive stores data on magnetic tape. This means that it cannot be downloaded right away; it has to be copied from tape to disk first. This process is called 'staging'.

    When you have made your selection of files, click on 'Stage selected'. This shows you a confirmation message, where you can press 'Submit'. It means that a request has been sent to the LTA staging service to start retrieving the requested files from tape and make them available on disk. You will get a confirmation e-mail to acknowledge that your staging request was received and queued. When the files are staged, you will get a notification e-mail informing you that your data are ready for retrieval.

    The e-mail that you get when the staging on disk is complete gives you a list of files and has several attachments. Amongst them are two files, html.txt and srm.txt.

    There are two different ways to download your files with these attachments: http and srm.

    We also attach plain lists of the files/SURLs that were scheduled for staging (in the confirmation mail), those that were successfully staged, and those that could not be staged (in the success / partial success notifications).

    Please take note of the following

    1. Unless you have an extremely fast connection (10 Gbit/s or more), it is in general advisable to stage no more than 5 TB at a time (see also point 4). At maximum efficiency a 1 Gbit/s connection will already take 12 hours to retrieve 5 TB of data, in practice it will often take quite a bit more.
    2. On a 1 Gbit/s connection as a general rule of thumb, you should be able to retrieve data at about 100-500 GB/hour, especially if you try to retrieve 4-8 files concurrently. If you see speeds much lower than this, you might have some kind of network problem and should in general contact your IT staff.
    3. Staging the data from tape to disk might take quite a bit of time. In the large data centres that the LTA uses, the tape drives are shared between all users and requests are queued. This applies not just to users of LOFAR, but also to users of other projects that have their data stored in the same data centres. As a result, it may take anywhere from a few hours to days to stage a copy of your data from tape to disk.
    4. The amount of space available for staging data is limited although quite large. This space is however shared between all LOFAR LTA users. This includes LTA operations for buffering data from CEP to the LTA before it gets moved to tape. If many users are staging data at the same time, and/or SDC operations is transferring large amounts of data, the system might temporarily run low on disk space. You might then get a message that your request was only partially successful. In general the request will still finish 1-2 days later and we do monitor if requests get stuck and restart if needed.
    5. We strive to keep a copy of data that was staged on disk for 1-2 weeks so you have some time to download it. After that it might get removed to make space for more recent requests. In accordance with the LOFAR data policies, the data will be available if you need to access it again at a later stage but you might need to stage a copy to disk again.
    6. We are continuously trying to improve the reliability and speed of the available services. Please contact SDCO if you have any problems or suggestions for improvement.
    7. The data centres that the LTA uses also have maintenance or small outages sometimes. SDCO can advise you if this is the case and when it is planned to end if you are having trouble accessing data. In general, outages that affect user data access will be announced on the main page of the LOFAR LTA website and will not be at the same dates as the LOFAR stop days.
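
    The arithmetic behind points 1 and 2 can be reproduced with a few lines of Python. This is a sketch with hypothetical function names; it assumes decimal units (1 TB = 10^12 bytes, 1 Gbit/s = 10^9 bit/s) and a link that is used at full capacity unless an efficiency factor is given.

```python
def transfer_hours(volume_tb, link_gbit_per_s, efficiency=1.0):
    """Hours needed to move volume_tb terabytes over the given link."""
    bits = volume_tb * 1e12 * 8                    # decimal terabytes to bits
    bits_per_second = link_gbit_per_s * 1e9 * efficiency
    return bits / bits_per_second / 3600.0


def gb_per_hour(link_gbit_per_s, efficiency=1.0):
    """Gigabytes that fit through the link in one hour."""
    return link_gbit_per_s * 1e9 * efficiency / 8 * 3600.0 / 1e9


if __name__ == "__main__":
    print(round(transfer_hours(5, 1), 1))   # 11.1 hours for 5 TB at 1 Gbit/s
    print(round(gb_per_hour(1)))            # 450 GB/hour at 1 Gbit/s
```

    With binary terabytes the first figure rises to a little over 12 hours, in line with point 1. An efficiency of 0.5 doubles the transfer time, which illustrates why real downloads often take quite a bit longer.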

    TBB data needs to be staged by hand. Please send a request via https://support.astron.nl/sdchelpdesk to have the data staged for you, specifying the filenames to be staged. To download the data, please follow the instructions under Download data for proper authentication. Data will then be available for download using:

    wget --no-check-certificate "https://lofar-download.grid.surfsara.nl/lofigrid/SRMFifoGet.py?surl=<filename>"

    The filename should start with srm://

    You will need a valid LTA account to access this data.

     

    Download data

    You can download your requested data with the files attached to your e-mail notification. There are different possibilities and tools to do this. If you're unsure which one to use, please refer to the corresponding FAQ answer at the bottom of this page.

    If you open html.txt, it contains a list of http links that you can feed to a unix command line tool like wget or curl or even use in a browser.

    For wget you can use the following command line:

    wget -i html.txt
    

    This will download the files in html.txt to the current directory (option '-i' reads the urls from the specified file).

    Preferably, especially when downloading large files, you should also use option '-c'. This will continue unfinished earlier downloads instead of starting a fresh download of the whole file. (Make sure to first delete existing files that contain error messages instead of data, if you use this option):

    wget -ci html.txt
    

    Note that wget does not overwrite existing files. If you use the continue option ('-c') it will append any missing parts to the existing file. If you don't use the continue option and there is a file present (e.g. from a stopped earlier download), wget creates a new file by appending a number (e.g., '.1') to the filename.
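
    Before re-running wget with '-c', it can help to find the files that contain an error message instead of data. The sketch below flags every downloaded '.tar' file that Python does not recognise as a tar archive. The function name is hypothetical, and the check assumes that the requested data products are tar archives.

```python
import os
import tarfile


def split_downloads(directory, suffix=".tar"):
    """Return (archives, suspects) for the files in directory ending in suffix.

    'suspects' are files that are not recognised as tar archives, for example
    because they hold the text of an error message instead of data.
    """
    archives, suspects = [], []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not (name.endswith(suffix) and os.path.isfile(path)):
            continue
        if tarfile.is_tarfile(path):
            archives.append(path)
        else:
            suspects.append(path)
    return archives, suspects


if __name__ == "__main__":
    good, bad = split_downloads(".")
    for path in bad:
        print("not a tar archive, inspect or delete before 'wget -c':", path)
```

    The script only reports the suspect files; deleting them is left to the user.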

    NB: Do not set the username and password on the wget command line because this allows other users on the system to view them in the process list. Instead you should create a file ~/.wgetrc with two lines according to the following example:

    user=lofaruser
    password=secret
    

    NB: This is only an example, you have to edit the file and enter your own personal user name and password!

    Make sure that the file permissions of the .wgetrc file only allow access by you (the owner) to avoid leaking your credentials, for example:

    chmod 600 ~/.wgetrc
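
    The two steps above (creating the file and restricting its permissions) can also be done in one go, so that the file is never readable by others, not even briefly. The following is a sketch for Unix-like systems; the function name is hypothetical, and it deliberately refuses to overwrite an existing file.

```python
import os


def write_wgetrc(user, password, path=None):
    """Create a wgetrc file that only its owner can read or write.

    Raises FileExistsError instead of overwriting an existing file.
    """
    if path is None:
        path = os.path.expanduser("~/.wgetrc")
    # O_EXCL makes creation fail if the file exists; 0o600 means owner-only access.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"user={user}\npassword={password}\n")
    return path
```

    If you already have a ~/.wgetrc, add the two lines by hand and check its permissions instead.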
    

    There is no easy way to have wget rename the files as part of the command directly: it does not accept the -O flag inside a file it gets with -i. You can either rename the files afterwards, e.g. using the following command:

    find . -name "SRMFifoGet*" | awk -F %2F '{system("mv "$0" "$NF)}'
    

    or add the -O option to each line in html.txt and then feed each line to wget separately, for example like this: cat html.txt | xargs -L 1 wget. By default the html.txt file does not contain such options.
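
    The renaming done by the find/awk one-liner above can also be sketched in Python, which avoids passing file names through a shell. The function names and the example file name are hypothetical; the sketch assumes that wget saved each file under the last part of its URL, with '/' encoded as '%2F'.

```python
import os
from urllib.parse import unquote


def original_name(downloaded_name):
    """Return the archive's own name from a file name saved by wget."""
    decoded = unquote(downloaded_name)      # turns '%2F' back into '/'
    return decoded.rsplit("/", 1)[-1]       # keep only the last path component


def rename_downloads(directory, prefix="SRMFifoGet"):
    """Rename all files starting with prefix; return the list of new names."""
    renamed = []
    for name in sorted(os.listdir(directory)):
        if not name.startswith(prefix):
            continue
        target = original_name(name)
        target_path = os.path.join(directory, target)
        if not target or os.path.exists(target_path):
            continue                        # never overwrite an existing file
        os.rename(os.path.join(directory, name), target_path)
        renamed.append(target)
    return renamed


if __name__ == "__main__":
    example = "SRMFifoGet.py?surl=srm:%2F%2Fsrm.example.org%2Fdata%2FL123456_SB000_uv.MS_abc.tar"
    print(original_name(example))           # L123456_SB000_uv.MS_abc.tar
```

    Calling rename_downloads(".") in the download directory renames every matching file and skips any whose target name already exists.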

    The following Python script will take care of renaming and untarring the downloaded files:

    #!/usr/bin/env python

    """This script is a pipeline to untar raw MS. 1 input is needed: the working directory.
    AUTHOR: J.B.R. OONK (ASTRON/LEIDEN UNIV. 2015) EDITED BY: M.IACOBELLI (ASTRON)"""

    import glob, os, sys

    usage = 'Usage: %s inDIR' % sys.argv[0]
    try:
        path = sys.argv[1]
    except IndexError:
        print(usage)
        sys.exit(1)

    #path = "./" # FILE DIRECTORY

    # os.path.join makes this work with or without a trailing slash on inDIR
    filelist = sorted(glob.glob(os.path.join(path, '*.tar')))
    print('LIST:', filelist)

    #FILE STRING SEPARATORS
    sp1d='%'
    sp2d='2F'
    extn='.MS'
    extt='.tar'

    #LOOP
    print('##### STARTING THE LOOP #####')
    for infile_orig in filelist:

        #GET FILE
        infiletar = os.path.basename(infile_orig)
        infile = infiletar
        print('doing file: ', infile)

        # Take the 12th '%'-separated field of the downloaded file name;
        # adjust the index if your file names contain a different number of fields.
        spl1=infile.split(sp1d)[11]
        spl2=spl1.split(sp2d)[1]
        spl3=spl2.split(extn)[0]
        newname = spl3+extn+extt

        inFILE = os.path.join(path, infile)
        newFILE = os.path.join(path, newname)
        # SPECIFY FILE MV COMMAND
        command='mv ' + inFILE + ' ' +newFILE
        print(command)

        # CARRY OUT FILENAME CHANGE !!!
        # - COMMENT FOR TESTING OUTPUT
        # - UNCOMMENT TO PERFORM FILE MV COMMAND
        os.system(command)
        os.system('tar -xvf '+newFILE)
        os.system('rm -r '+newFILE)

        print('finished rename / untar / remove of: ', newFILE)

    The file srm.txt contains a list of srm locations which you would feed to a grid client like gfal-copy. SRM is a GRID specific protocol that is currently supported for data at the LTA locations. It is faster, especially if you have significantly more than 1 Gbit/s bandwidth. It requires a valid GRID certificate and installation of grid client tools such as the gfal suite.

    Contact SDC Operations via the ASTRON SDC helpdesk if you think you might need a GRID account but it cannot be provided by your own institute.

    We advise users to look into using the gfal client library and command line tools. SURF provides good documentation on usage of grid storage client software, where they show examples of how some gfal scripts can be used.

     


     

    Below is a collection of frequently asked questions, covering both general questions and troubleshooting.

    General

    • Why is data retrieval so difficult?
    • What is an appropriate amount of data to retrieve?
    • What is all this SRM/staging stuff about?
    • Do I have to make new requests via the web catalogue?
    • There are different ways to download data. Which one is the best?
    • My download speeds are too slow. What can I do?
    • I want to contact Science Data Centre Operations. What information should I include?

    Why is data retrieval so difficult?

    It is important to understand the data volumes of LOFAR are pretty huge and handling them requires different technologies than what we all know and use in our everyday life. For instance, LTA data is stored on magnetic tape and has to be copied to a hard drive (getting 'staged') before it can be retrieved. To transfer these amounts of data within reasonable time requires careful consideration and special tools. We try to make the LTA as convenient to use as possible, e.g. by providing http downloads for users without Grid certificate and supporting the use of Grid client tools for those who want or need the extra performance. We are aware of the fact that data retrieval is quite close to the backend technology and we hope to be able to provide solutions with higher abstraction in the future. But it will always be necessary to prepare data for download, so users are expected to plan a bit ahead (sorry!).

    What is an appropriate amount of data to retrieve?

    This depends. As a rule of thumb, we ask you to keep each request below 5 TB in volume and below 1000 files. Also, the total file count in all your running requests should not exceed 5000 files at any point in time. Beyond that, there are essentially two things to consider: the capabilities of your own system and the capabilities of the LTA services.

    The most important thing to know about LTA capabilities, is that the disk pool that temporarily holds your data and from where it can be downloaded, has a limited capacity. This means that the data you requested is only available for download for a limited time (since the space is needed for new requests at some point). Your data is only guaranteed to stay available for 7 days. It can be re-requested after that, but you should never request more data than you can download within a few days. In most cases this is limited by the capabilities of your own system, especially your network connection (and available local storage space, of course).

    The second most important LTA limit is the number of files that can be processed at the same time. Some projects do not have a lot of data volume, but the data is distributed over very many files. With large file counts, the management of the request itself puts a large load on the system. There is a maximum queue size of 10,000 files for all user requests together. So make sure to only occupy a fraction of that and wait until earlier submitted requests have finished (you get notified) before you submit new requests.

    Note that the larger your request, the longer it takes until you can retrieve the first file. Also, please limit the number of requests running in parallel to a few, especially when they contain many files. In principle, we avoid introducing hard limits, but rely on reasonable user behaviour. This also means that you can block the system for a long time or, in the worst case, even bring it down. So please act responsibly or we might have to enforce some limits in the future to keep the system available for other users. Be aware, that we may cancel your request(s) in excessive cases to maintain LTA operation.

    If you, by accident, staged some 100,000 files or 100 TB of data, please contact the ASTRON SDC helpdesk, so that we can stop these requests, thanks!
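
    The rule of thumb for the size of a single request can be applied automatically when a large selection has to be staged in several steps. The sketch below groups files into batches that stay within 5 TB and 1000 files each. The function name and the (name, size) input format are assumptions, and the script only plans the batches; it does not submit anything to the LTA.

```python
def split_into_requests(files, max_bytes=5 * 10**12, max_files=1000):
    """Group (name, size_in_bytes) pairs into batches within both limits.

    A single file larger than max_bytes still gets a batch of its own.
    """
    batches, current, current_bytes = [], [], 0
    for name, size in files:
        full = len(current) >= max_files
        too_big = bool(current) and current_bytes + size > max_bytes
        if full or too_big:
            batches.append(current)          # close the current batch
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    files = [(f"file_{i}", 10**9) for i in range(2500)]   # 2500 files of 1 GB
    print([len(batch) for batch in split_into_requests(files)])  # [1000, 1000, 500]
```

    Submit the batches one after another, and wait for the notification of the earlier request before submitting the next, so that your running requests stay well below the overall limits mentioned above.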

    What is all this SRM/staging stuff about?

    These are technical terms that refer to the storage backend of the LTA. Each of the three LTA sites (in Amsterdam, Juelich, and Poznan) operates an SRM (Storage Resource Management) system. Each SRM system consists of magnetic tape storage and hard disk storage. Both are addressed by a common file system, where each file has a specific locality: it can be on disk ('online'), on tape ('nearline'), or both. LTA data is usually on tape only. Since the tape is not directly accessible but only through an (automated) tape library, the data on it first has to be copied from tape to disk before it can be retrieved. This process is called 'staging'. You can download the data only while it is (also) on disk. To save cost, the disk pool is of limited capacity and only meant for temporarily caching data that a user wants to access. After 7 days, all data is automatically 'released', which means that it may be deleted from the disk storage as soon as the space is required for other data. It then has to be staged again in order to become accessible again.

    Usually, you don't have to worry about the details. But be aware, that data retrieval is a two-step procedure: 1) preparation for download ('staging') and 2) the download itself. Also, take care not to request too much data at the same time.

    Do I have to make new requests via the web catalogue?

    In principle, yes, this is the only supported procedure, at the moment. There are advanced methods based on programming interfaces and libraries but these require a certain level of expertise to install and use (see this page for more information). When using the advanced access methods you are expected to take extra care to apply fair use practices and not to overly stress the systems. If you are an 'expert user', are self-dependent enough to figure out how to work with this, and have a good reason, please contact the ASTRON SDC helpdesk for some instructions and an emphatic admonition to take extra care.

    There are different ways to download data. Which one is the best?

    That depends. In short: Http downloads are the easiest (e.g. via wget), but downloads via SRM tools can be faster and are encouraged for large amounts.

    The SRM systems which the LTA sites operate are integrated in the Grid. To work with them directly, you need a Grid certificate. To allow users without a Grid certificate to download LTA data, we operate webservers as a frontend to the SRM backend. These webservers provide the requested data via http downloads. The webservers are not excessively capable machines and meant for occasional users. If you retrieve huge amounts of data on a regular timescale, please work with SRM directly, especially if you own a Grid certificate.

    You may want to read this FAQ answer as well to make a decision: My download speeds are too slow. What can I do?

    My download speeds are too slow. What can I do?

    First of all, you have to check how slow your download really is. If wget shows an estimated time of arrival of several hours, this does not necessarily mean that the download is 'slow': some files in the LTA are simply very large. In most cases, your local network connection will be the bottleneck. For instance, a standard 'Gigabit Ethernet' network connection allows download speeds of around 120 MB/s at a maximum. Our systems are able to handle that easily. In case you can rule out your network connection as the bottleneck: there are different ways to download your data and not all provide the same performance. In our experience, this is the order of performance:

    • Http downloads are the slowest option. The speed is limited by the server's network connection (~120 MB/s), which is shared by all users, and by an upper limit per download (~30 MB/s) that exists for technical reasons. If your download maxes out at the per-download limit, you may try to start up to four downloads in parallel. Note: no performance benefit is to be expected from more than four parallel transfers! There is also a connection limit, which you may trigger if you start too many parallel downloads.
    • Using grid client tools is the faster option in most cases, since you work with the grid storage backend directly. You may want to check out active gridftp transfer mode if you are at a remote location.
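
    The advice on parallel http downloads can be sketched with a small Python wrapper around wget that never runs more than four transfers at once. The function names are hypothetical. The sketch assumes that wget is installed and that your credentials are stored in ~/.wgetrc as described under Download data.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def fetch_with_wget(url):
    """Download one URL with 'wget -c' and return wget's exit status."""
    return subprocess.run(["wget", "-c", url]).returncode


def download_all(urls, fetch=fetch_with_wget, max_parallel=4):
    """Download all URLs with at most max_parallel transfers at a time.

    Returns a dict that maps each URL to the result of fetch(url).
    """
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return dict(zip(urls, pool.map(fetch, urls)))


if __name__ == "__main__":
    with open("html.txt") as handle:
        links = [line.strip() for line in handle if line.strip()]
    failed = [url for url, status in download_all(links).items() if status != 0]
    print("failed downloads:", failed)
```

    URLs reported as failed can simply be passed to download_all again, since 'wget -c' continues partially downloaded files.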

    You may also want to read this FAQ answer for further explanation: There are different ways to download data. Which one is the best?

    I want to contact Science Data Centre Operations. What information should I include?

    You are welcome to contact Science Data Centre Operations in case of problems that you cannot solve yourself. However, we kindly ask you to include all important information in your inquiry, so that we can quickly help you with your problem without too much back and forth:

    • It is absolutely essential, that you include a clear answer to the following:
      1. What exactly did you try to do?
      2. What went wrong?
      3. When exactly did it fail (so we can check the logs)?
    • If you are asking about a command that failed, please copy-paste the exact command that you executed together with the full terminal output. Some tools have a '-debug' option, which provides additional information, e.g. about your environment. It helps a lot if you could use that option when you copy-paste your command output.
    • If you are using some script somebody gave you, please note that we are no clairvoyants and have no idea what the script you're using actually does. We can most likely not understand what went wrong from the output of some random script. Please check this page carefully, whether the officially supported ways of data retrieval work for you. If they work, please ask the one who supplied you with the failing script, why their script fails. If the official ways don't work, please forget about your script for a moment and provide the output of the official tool that does not work for you.

     

    Troubleshoot

    • I did not receive a mail notification that my request was scheduled!
    • I did not receive a mail notification that my data is ready for retrieval! Has my request been lost?
    • I got an email that says my staging request has failed! What happened?
    • I got an email that says my staging request was only partially successful! What's going on?
    • Oops! I made a mistake! How can I stop a request?
    • My files only contain some error message instead of data
    • My data files are corrupted / I cannot unpack my data
    • My downloads fail with error "All Ready slots are taken and Ready Thread Queue is full"
    • My downloads don't start / time out
    • Http downloads randomly fail with "503 Service Temporarily Unavailable"
    • When selecting a project it fails with "401 - No permission -- see authorization schemes"
    • SRM commands fail with error containing "Java heap space"
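    • SRM commands fail with error '426 Connection refused'
    • SRM commands fail with error 'srm client error: org.globus.gsi.CredentialException: proxy not found'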
    • SRM/Grid commands fail with error 'AC validation failed!' or 'No trusted path can be constructed'
    • SRM/Grid commands fail and I cannot figure out why!

    I did not receive a mail notification that my request was scheduled!

    If the LTA catalogue did not show any error when you submitted your request, it is safe to assume that your request was registered in our staging system. You should usually get a notification mail within a few minutes. If you have not received the notification within an hour of submission, our staging service may be down. Your request is not lost in this case and will be picked up once the service is back online. In urgent cases, or if you suspect that something went wrong while submitting your request, please contact the ASTRON SDC helpdesk.

    I did not receive a mail notification that my data is ready for retrieval! Has my request been lost?

    Once you have received a notification that your request was scheduled, it is in our database and is very unlikely to get lost. Staging requests can take up to a day or two, but will finish a lot sooner in most cases. This depends on the size of your request, but also on how busy the storage systems are with other users' requests at that moment. Sometimes the LTA storage systems are down for maintenance, which can delay the whole procedure. You can check for downtimes here.

    It is no cause for alarm if your request has not finished within 24 hours, even if your previous request finished within 10 minutes. In urgent cases, or if you have not received a notification after 48 hours, please contact the ASTRON SDC helpdesk.

    I got an email that says my staging request has failed! What happened?

    This means that the storage backend could not fulfil the request at all. The system itself may be fine while none of the files from your request could be staged (e.g. because files are missing). Check the error message in your mail notification for details. The notification can also indicate a general problem with the storage system or with the staging service itself, i.e. something is broken or down for maintenance. We try to detect all temporary issues and only inform users when something is wrong with their request itself, but we cannot foresee all eventualities. If you cannot make sense of the error message, or don't know how to deal with it, please contact the ASTRON SDC helpdesk.

    If you used the stager API to submit your request, please first check whether you made a mistake, e.g. entered the wrong SURLs.

    Note: We get notified of these issues as well and will usually reschedule requests that failed due to server issues once the problem has been solved. So please first check whether you got a 'Data ready for retrieval' notification for the same request id after the error notification. If you did, the problem has already been resolved.

    I got an email that says my staging request was only partially successful! What's going on?

    In general, this means that the SRM system works fine, but there was a problem processing your request. As a result, some of your files could be staged and some could not. Your mail notification should include a list of the files that could not be prepared for download, together with an error message stating the cause. If the error message says 'Incorrect URL: host does not match', you combined files in one request that are stored at two different SRM locations (e.g. one file at SURF (Amsterdam) and one file at Juelich). When one SRM location gets the request, it can only stage its local files. To avoid this error, request the files from different locations separately. Other messages should be self-explanatory, e.g. if a file is missing. If you cannot make sense of the error message, or don't know how to deal with it, please contact the ASTRON SDC helpdesk.
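
    To avoid mixing SRM locations in one request, a list of SURLs can be split by host before submitting. The following is a minimal sketch, assuming SURLs of the form srm://<host>:<port>/<path> with one SURL per line. The function name split_surls_by_host and the output file naming (surls_<host>.txt) are arbitrary choices for this example.

```shell
# Split a list of SURLs (one per line) into one file per SRM host, so that
# each resulting file can be submitted as a separate staging request.
split_surls_by_host() {
    surl_list=$1
    out_dir=$2
    # With "/" as field separator, field 3 of srm://host:port/path is host:port.
    awk -F/ -v dir="$out_dir" '
        NF >= 3 && $3 != "" {
            host = $3
            sub(/:.*/, "", host)        # drop the port, if present
            print > (dir "/surls_" host ".txt")
        }
    ' "$surl_list"
}
```

    For example, 'split_surls_by_host all_surls.txt .' writes one surls_<host>.txt file per host into the current directory.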

    If you used the stager API to submit your request, please first check whether you made a mistake, e.g. entered the wrong SURLs.

    Note: We get notified of these issues as well and will usually reschedule requests that failed due to server issues once the problem has been solved. So please first check whether you got a 'Data ready for retrieval' notification for the same request id after the error notification. If you did, the problem has already been resolved.

    How to stop a staging request?

    It is currently not possible to stop a staging request via the web interface. You can use the stager API (see here) for this. Alternatively, stay calm and ask the ASTRON SDC helpdesk to stop the request for you.

    My files only contain some error message instead of data

    Most errors should result in a 404/50x return code. However, some errors are still returned as a message in place of the file contents. Please read the error message carefully; in many cases it gives some indication of what went wrong. If this does not help you, please contact the ASTRON SDC helpdesk or retry after a few hours.

    Important: If you use wget with option '-c', please note the following: wget does not check the contents of an existing file. When you restart wget with option '-c' (continue) to retrieve the failed files, it appends the remaining data to the existing file, which contains the error message and not the first part of your data. To avoid corrupted data, make sure to delete the existing error files (recognizable by their small file size) before calling 'wget -ci' again. If you have already ended up with a corrupted file, you have to delete it and retrieve the whole file again.
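
    The small error files mentioned above can be located by size before wget is restarted. The sketch below only lists candidates and deletes nothing. The function name list_small_files and the default threshold of 1 MiB are assumptions made for this example, not values defined by the LTA; pick a threshold well below the size of your real data files.

```shell
# List regular files directly inside a directory that are smaller than a
# given number of bytes, as candidates for error-message files.
list_small_files() {
    dir=$1
    max_bytes=${2:-1048576}         # assumed default: 1 MiB
    # "-size -Nc" matches files of fewer than N bytes
    find "$dir" -maxdepth 1 -type f -size "-${max_bytes}c"
}
```

    For example, 'list_small_files . 10240' prints all files in the current directory that are smaller than 10 KiB. Inspect each one before deleting it.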

    My data files are corrupted

    Check whether the files are much smaller than you expect, since something might have gone wrong with the transfer. Please check the beginning of your files, e.g. with the Linux 'head' or 'less' commands. If there is an error message, please refer to the question above. Otherwise, please try to retrieve an affected file again. If this does not help, please contact the ASTRON SDC helpdesk.
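
    The check of the beginning of a file can also be scripted. The following sketch reports whether the first 512 bytes of a file consist only of printable text, which hints at an error message rather than binary data. It is a heuristic under the assumption that real data files contain non-text bytes near the start. The function name starts_with_text and the 512-byte window are choices made for this example.

```shell
# Succeed (exit status 0) if the first 512 bytes of a file contain only
# printable characters and whitespace, i.e. the file starts like plain text.
starts_with_text() {
    file=$1
    # Delete all printable and whitespace bytes, then count what is left.
    non_text=$(head -c 512 "$file" | LC_ALL=C tr -d '[:print:][:space:]' | wc -c | tr -d ' ')
    [ "$non_text" -eq 0 ]
}
```

    For example, 'starts_with_text myfile.tar && head -c 512 myfile.tar' prints the start of the file only if it looks like text.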

    My downloads fail with error "All Ready slots are taken and Ready Thread Queue is full"

    This usually means the storage backend system is overloaded and you should try again in a few hours.

    My downloads don't start / time out

    The SRM system may be down for maintenance; please check the LTA portal or the LOFAR Downtimes page. If no maintenance is announced, there is probably something wrong with the download service. Please try again a bit later and submit a support request to the ASTRON SDC helpdesk if the issue persists.

    Http downloads randomly fail with "503 Service Temporarily Unavailable"

    This can indicate that too many users are downloading at the same time. Please try again a bit later. There is also a limit on the number of simultaneous downloads you are allowed to start yourself. Please limit yourself to four simultaneous downloads; the overall download rate will not improve with a larger number of connections.
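
    Trying again later can be automated with a simple retry loop. The sketch below repeats a command a limited number of times with a fixed pause between attempts. The function name retry_with_delay and the example numbers are assumptions for illustration; choose a generous pause so that retries do not add to the load on the server.

```shell
# Run a command until it succeeds, at most max_attempts times, sleeping
# delay_seconds between attempts. Returns 0 on success, 1 otherwise.
retry_with_delay() {
    max_attempts=$1
    delay_seconds=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay_seconds"
    done
}
```

    For example, 'retry_with_delay 5 600 wget <url>' would try a download up to five times, waiting ten minutes after each failure.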

    When selecting a project it fails with "401 - No permission -- see authorization schemes"

    This happens when you try to select a project while you are not logged in to the LTA. Please first select another tab, e.g. search, then try to select your project again.

    SRM commands fail with error containing "Java heap space"

    SRM tools are no longer widely supported and should be avoided. With that warning in mind: the SRM tools ignore the system's default Java heap space settings, and their own default is fairly low. You are probably trying to process a long list of files. Either reduce the number of files in that request or increase the SRM-specific heap space by setting the environment variable 'SRM_JAVA_OPTIONS' to a higher value (e.g. '-Xms256m -Xmx256m'; the default is '-Xms64m -Xmx64m').
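
    Both remedies can be sketched in the shell as follows. The heap values are the example values from the text above. The function name split_file_list, the default of 100 lines per part and the file names in the usage example are assumptions for illustration only.

```shell
# Remedy 1: raise the SRM-specific heap size for the current shell session
# (example values taken from the text above).
export SRM_JAVA_OPTIONS='-Xms256m -Xmx256m'

# Remedy 2: break a long file list (one entry per line) into smaller lists
# that can be processed one after another.
split_file_list() {
    list_file=$1
    prefix=$2
    lines_per_part=${3:-100}        # assumed default: 100 entries per part
    split -l "$lines_per_part" "$list_file" "$prefix"
}
```

    For example, 'split_file_list surls.txt surls_part_ 100' creates surls_part_aa, surls_part_ab and so on, each with at most 100 entries.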

    SRM commands fail with error '426 Connection refused'

    SRM tools are no longer widely supported and should be avoided. With that warning in mind: your firewall is probably not allowing active FTP transfers. Make sure that you call srmcp with option '-server_mode=passive'.

    SRM commands fail with error 'srm client error: org.globus.gsi.CredentialException: proxy not found'

    SRM tools are no longer widely supported and should be avoided. With that warning in mind: ensure you have run 'voms-proxy-init' to generate an up-to-date proxy file. If the error persists, the SRM tools may not be using the default proxy file location $HOME/.proxy, or you may have used a non-standard proxy location in 'voms-proxy-init'.

    • Either set the X509_USER_PROXY environment variable to your .proxy file, e.g.
    export X509_USER_PROXY=$HOME/.proxy
    
    • or pass -x509_user_proxy=<path-to-.proxy-file>, e.g.
    srmcp -x509_user_proxy=$HOME/.proxy <rest-of-command>
    
    SRM/Grid commands fail with error 'AC validation failed!' or 'no trusted path can be constructed'

    SRM tools are no longer widely supported and should be avoided. With that warning in mind: this indicates an issue with creating a secure connection to the server. There is either an issue with your personal certificate/proxy/key or with the set of trusted server certificates.

    • Have you registered at the LOFAR VO? You can do that at https://voms.grid.sara.nl:8443/voms/lofar. This requires your Grid certificate to be installed in your browser (http://ca.dutchgrid.nl/info/browser).
    • Make sure your set of server certificates is up to date (see trusted CA certificates).
    • Maybe your private key uses an unsupported algorithm. You might want to try converting it with a command like the following (keep a backup copy of the key first, since the command writes to the same file it reads): 'openssl rsa -des3 -in .globus/userkey.pem -out .globus/userkey.pem'

    Grid commands fail and I cannot figure out why!

    Retry with options for maximum verbosity and/or debug (see command help and/or man pages), which will print a lot of debug information to stdout. If this does not help you to figure out what is going wrong, submit a support request to the ASTRON SDC helpdesk. Please refer to "I want to contact Science Data Centre Operations. What information should I include?" above for what to include in your request.

     


    Last updated: May 2024

     
