This portal is developed and maintained at CCIN2P3 on behalf of EGI.eu.

Release Version: 3.1.2 - March 23rd

For any contact, use this section: contact us

The Operations Portal is a central portal for EGI operations management. It offers a range of capabilities, such as the broadcast tool, VO management facilities and various dashboards (Security, VO and Operations), to facilitate infrastructure oversight.

Latest news

1. Self Assessment of the NGI/Site Security Teams 
A survey is available to assess the maturity of the security team at your Resource Center: https://check.ncsc.nl/questionnaire/
The EGI CSIRT team is looking for at least 10 EGI Resource Centers willing to fill it in. We will be grateful for your input. 

2. Site recertification after security site suspension 

After a site suspension the recertification process should take place. It is performed by the ROD team (https://wiki.egi.eu/wiki/PROC09). We remind you that in the case of a security site suspension, recertification must be approved by EGI CSIRT. 

3. Torque 4 

Sites that are still on a Torque version < 2.5.13 (if any) are encouraged to update to the Torque 2.5 version patched by SVG, configuring the AppDB repository with a higher priority than EPEL, so that the Torque 4 released by EPEL will never be installed automatically. 

Sites that updated to Torque 4 and want to downgrade back to 2.5 should also use the SVG version of Torque 2.5.13 mentioned above. 

Sites that chose Torque 4 and have it 100% working can keep it. 
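
The Torque advice above reduces to a version comparison. The following is a minimal Python sketch, assuming the installed version is available as a plain string such as "2.5.12". The function names and the returned labels are illustrative only and are not part of any EGI, SVG or Torque tool.

```python
import re

# SVG-patched Torque 2.5 release named in the announcement.
PATCHED = (2, 5, 13)


def parse_version(text):
    """Turn a version string such as '2.5.12' into a tuple of three ints."""
    parts = re.findall(r"\d+", text)
    if len(parts) < 3:
        raise ValueError(f"expected a version like '2.5.13', got {text!r}")
    return tuple(int(p) for p in parts[:3])


def torque_advice(installed):
    """Map an installed Torque version to the advice given in the broadcast."""
    version = parse_version(installed)
    if version[0] >= 4:
        # Torque 4 may stay if it works; otherwise downgrade to the SVG build.
        return "torque4-keep-or-downgrade"
    if version < PATCHED:
        return "update-to-svg-2.5.13"
    if version[0] == 2:
        return "no-action"
    # Anything else (for example 3.x) is not covered by the announcement.
    return "not-covered"
```

For example, torque_advice("2.5.12") yields "update-to-svg-2.5.13", while torque_advice("2.5.13") yields "no-action". The sketch does not check repository priorities; those still have to be configured as described above.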

4. The UMD 3.12.0 update was released in the UMD production repositories on May 5, 2015. 
The products released in this update are: 
* ARGUS-PAP, v. 1.6.4, versions for SL5 and SL6 

* dCache server, v.2.10.24, versions for SL5 and SL6 
Please visit http://repository.egi.eu/2015/05/05/release-umd-3-12-0/ for details of the changes in each product. 

5. EGI Conference http://conf2015.egi.eu/ Registration deadline 11.05.2015
The next EGI conference, which will be held in Lisbon, Portugal from 18 to 22 May 2015, is approaching. It promises to be a great occasion for new and current users of the EGI infrastructure, as well as resource and service providers, to meet and discuss requirements, roadmaps, features, and much more.
EGI Operations would like to highlight the following sessions, which should be of interest to Operations teams:
• Security training
• Tools for operating e-Infrastructure
   – Requirements and plans for the evolution of the EGI Core Infrastructure
• EGI federated cloud: state of the art, demos and production use cases 
   – Status of the EGI Federated Cloud, the capabilities currently offered and how different scientific communities are benefiting from them
• Data accounting
   – Requirements and use cases to add data accounting to the EGI/APEL accounting system
• Federated accelerated computing 
   – State of the art and barriers with providers of GPGPUs or MIC co-processors in EGI
• Federated Cloud IaaS - towards an international multi-disciplinary science cloud
   – Integration activities that will advance the IaaS capabilities of the EGI Federated Cloud
• Federated Operations Solution
   – Presentation of capabilities offered by the EGI Federated Operations Solution and their future evolution.
• Cloud PaaS - all user communities meeting
   – Moving EGI towards an ecosystem where PaaS and SaaS services are operated alongside IaaS offerings
• Service Level Management for federated e-Infrastructures
   – Current status of resource allocation processes in EGI, requirements gathering and collecting feedback on existing solutions
• Security for cloud federations
   – Definition of the security profile of the EGI federated cloud
• EGI Marketplace
   – Presentation of the basic concepts and gathering of input on the requirements for the EGI Marketplace.
• EGI service to support individual researchers or small collaborations 
   – Presentation of the current status of the work through a demonstration of LTOS platform usage, and discussion of the next steps in launching the trial. 
• Towards an Open Data Cloud
   – Requirements for the realization of a distributed "research data cloud" that brings cloud and grid computing close to data for scalable access, use and reuse of research data.
• AAI - all user communities meeting
   – Current state of the art of the e-infrastructures, in terms of AAI solutions, and the roadmaps for the evolution

Best regards, 
EGI Operations Support Team 
Dear Colleagues,
we plan to shut down 3 CEs of our DESY-HH Grid infrastructure with a
number of rather old WNs. The info system and the GOCDB will be adjusted accordingly.
The bulk computing resources at DESY-HH will remain running and can be reached as before.

Regards
Andreas Gellrich for the DESY Grid team
There was another incident of lost accounting data in the message broker network in April. In March only a small number of sites were affected and we contacted them directly with GGUS tickets. For April the number is somewhat larger, so we will raise tickets against the NGIs, leaving them to contact the affected sites directly. 

Our signature for this problem is that sites went CRITICAL for the Nagios APEL_Sync test on 3rd May http://bit.ly/1GKzigW . Sites that had already been CRITICAL for a longer period mask this signature and are not on our list. If you are completely red on the page linked above, you can cross-check your site with the web view of the APEL_Sync test http://bit.ly/1PiuQfS 

Advice on republishing is given here https://wiki.egi.eu/wiki/APEL/Gaps-April2015 

We believe that we have mitigated future problems but we are still investigating with the Message Broker Team. 

The APEL Team
Dear colleagues,

on Tue Apr 28 the old aliases (lcg-)voms.cern.ch were removed.

Below are updated reminders concerning the switch to the new VOMS servers:

UI configuration
================

Since the old VOMS services (lcg-)voms.cern.ch cannot be used to obtain
VOMS proxies anymore, it is desirable to remove any references to those
services from the VOMS client configuration on User Interface hosts.
The configuration files are by default located in the /etc/vomses directory.

Note: the YAIM configuration tool will only _create_ such files according
to its configuration --> stale files will have to be _manually_ removed.

The VOMS clients will skip over services that are not available (anymore),
but in such cases the user may get confusing error messages.
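
Finding the stale files can be sketched in Python. This is a minimal example, not an official tool. It assumes each vomses entry is a line of quoted fields with the server host name in the second field, and the function name is hypothetical. It only reports matches and deletes nothing, so the removal stays a manual step.

```python
import shlex
from pathlib import Path

# Host names of the retired services, as named in the announcement.
RETIRED_HOSTS = {"voms.cern.ch", "lcg-voms.cern.ch"}


def stale_vomses_entries(directory="/etc/vomses"):
    """Return (path, line number, line) for entries that name a retired host."""
    hits = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        lines = path.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            try:
                fields = shlex.split(line, comments=True)
            except ValueError:
                continue  # unbalanced quotes: skip rather than guess
            # Assumed layout: "alias" "host" "port" "server DN" "VO"
            if len(fields) >= 2 and fields[1].lower() in RETIRED_HOSTS:
                hits.append((path, number, line.strip()))
    return hits
```

The host field is compared exactly, so entries for other servers whose names merely contain "voms.cern.ch" are not reported.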

Grid-mapfile configuration
==========================

One of the old services (voms.cern.ch) was also being used for creating
various types of grid-mapfiles still used by some grid services.

That functionality (VOMS-Admin) was available until today (Tue Apr 28).

Grid-mapfiles are typically created by the edg-mkgridmap utility that is
robust against a VOMS server being temporarily unavailable for some VO:
in such cases it will _not_ delete any DN for the affected VO.

In the current case, however, the given VOMS server will not come back
and therefore should instead be removed from the configuration.
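
A similar check can locate the configuration lines that still point at the retired server. This is a minimal Python sketch, assuming the edg-mkgridmap configuration names each VOMS server as a URI token on a line and that comments start with "#". The function name is hypothetical, and the configuration file location is left to the caller because it differs between installations.

```python
from urllib.parse import urlsplit

# The retired service that was used for grid-mapfile generation.
RETIRED_HOST = "voms.cern.ch"


def lines_using_retired_server(config_text, retired_host=RETIRED_HOST):
    """Return (line number, line) for entries whose URI names the retired host."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        for token in stripped.split():
            if "://" not in token:
                continue
            try:
                host = urlsplit(token).hostname
            except ValueError:
                continue  # malformed URI: leave it for manual inspection
            if host == retired_host:
                hits.append((number, stripped))
                break
    return hits
```

The reported lines are candidates for manual removal. The sketch does not edit the configuration itself.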

LSC files
=========

LSC files referring to the old VOMS servers are simply ignored;
there is therefore no need to remove them explicitly.

VOMS configuration details
==========================

The details for the new situation are described on this page:

    https://twiki.cern.ch/twiki/bin/view/LCG/VOMSLSCfileConfiguration

Also the VO cards of the LHC experiments and the Ops VO are up to date.
Dear VO managers

In 2013 the EGI core activities were selected by the EGI Council as critical for the support of the EGI production infrastructure. The goal of the selection was to ensure the sustainability of critical components of the infrastructure through dedicated funding that does not depend on any project.
Since April 2014 the EGI Core services have been provided by EGI Partners selected through a bidding process.

EGI has prepared a survey for the contacts (VO managers) of EGI Virtual Organizations. Its goal is to assess the impact of the services on Virtual Organizations and whether the priorities defined almost two years ago are still valid:
https://www.surveymonkey.com/r/EGI_core_services_VO_assessment

Please discuss the survey with your team and provide, if possible, one answer per Virtual Organization. 
We look forward to receiving your answers by 1.05.2015

For every service we will ask you the following questions:

* How much has this service been used by your Virtual Organization?
* Is this service critical for the operations of your Virtual Organization?

In addition, we will ask you to assess other services that are not part of the EGI Core services group, to understand whether they should be taken into account in the next period.

The output of this survey will be used, among other inputs, to define the set of core services that will be planned for provisioning starting from Spring 2016.

Thank you for filling in our survey. 
Your input is very important for us to support you.

Małgorzata Krakowian
EGI Operations


Dear all,

the old torque/maui-based batch system behind CE lcg-cream.ifh.de will finally retire on 4th May 2015.

Please change to the new CEs:

grid-cream1.zeuthen.desy.de & grid-cream2.zeuthen.desy.de

if you have not done so already. These serve a Univa Gridengine 8.2 batch system.

Cheers,
Your DESY-ZN site admins
The CREAM CEs at the RAL Tier1 (RAL-LCG2) will be decommissioned on Tuesday 5th May 2015. On this day job submissions to these CEs will be stopped. Once all active jobs have finished the CEs will be turned off.
The CEs are:
cream-ce01.gridpp.rl.ac.uk
cream-ce01.gridpp.rl.ac.uk
From this date only ARC CEs will be provided at this site.

A downtime has been declared in the GOC DB for these services starting on 5th May.
Hi,

=Summary=
A large number of sites are failing the APEL Sync test for March 2015. Please do not republish until asked to.

=Detail=
An unusually large number of sites have failed the APEL Sync test for March. This Nagios test checks that all accounting data at a site has been loaded into the central accounting repository. The APEL accounting team are investigating the cause of the problem and are testing a fix with a small number of sites. Please do not instruct sites to republish until you receive further instructions from the accounting team. If a large number of sites republish at the same time, it will slow down the system for everyone.

I apologise for the inconvenience and thank you for your patience.
Stuart Pullinger
APEL Accounting Team Leader
Dear all,

EUGridPMA has announced a new set of CA RPMs. Based on this IGTF release, a new set of CA RPMs has been packaged for EGI. 

Please upgrade at your earliest convenience within the next six days. After this period, SAM will raise critical errors on CA tests if old CAs are still detected.

Please check https://wiki.egi.eu/wiki/EGI_IGTF_Release for more details

EGI UMD software provisioning Team

Release Notes:

European Grid Infrastructure EGI Trust Anchor release 1.63          2015.03.30

------------------------------------------------------------------------------
   For release DOCUMENTATION available on this EGI Trust Anchor release see   
               https://wiki.egi.eu/wiki/EGI_IGTF_Release                      
------------------------------------------------------------------------------

Modifications compared to the previous release:
* updated to IGTF Accredited CA distribution version 1.63-1 Classic, SLCS and 
  MICS profiles, encoded in meta-package "ca-policy-egi-core-1.63-1" (for new 
  installs) and "lcg-CA-1.63-1" (for sites upgrading from EGEE/LCG releases).

* Location of the repository changed to repository.egi.eu.  See documentation
  for details and the updated repo files.

* You may install BOTH the "egi-core" AND "lcg" meta-packages,  according to
  your  policies.   Note that  your organisation or  NGI may have  a specific 
  policy and may have added or removed CAs compared to the EGI core policy.

The following notices are republished from the IGTF and EUGridPMA, inasfar 
as pertinent to this release.  More information can be found in the 
EUGridPMA newsletter (see https://www.eugridpma.org/):

Changes from 1.62 to 1.63
-------------------------
(30 March 2015)

* Removed obsoleted and replaced NIIF CA (HU)
* Extended validity period of the KEK CA (JP)
* Removed obsoleted d254cc30/CERN-Root 1d879c6c/CERN-TCA anchors (CERN)
* Updated RPDNC namespaces to permit DigiCert Grid Trust G2 ICAs for
  DigiCert Assured ID Root CA (US)
* Updated RPDNC namespaces and signing_policy files for G2 series
  DigiCert Grid CAs pending ICA reissuance for reverse RDN issue (US)
* Normalised cond_subject syntax for multiple signing policy files
   cilogon-basic cilogon-silver InCommon-IGTF-Server-CA NCSA-slcs-2013
   NCSA-tfca-2013 Comodo-RSA-CA

The CA modifications, encoded in both "requires" and "obsoletes" clauses, have
been incorporated in the above-mentioned meta-package RPMs. This trust anchor 
release is best enjoyed with fetch-crl v3 or better, available from popular 
GNU/Linux OS (add-on) repositories Fedora, EPEL, Debian, and from the IGTF.

Version information: ca-policy-egi-core = 1.63-1
The problem at the RAL Tier1 that stopped access to a number of database machines has been fixed. Our Castor storage and Atlas Frontier services are available again. The GOC DB is also available again. 
We are investigating a problem at the RAL Tier1. We have lost access to a number of database machines. At the moment our Castor storage and Atlas Frontier services are not available. The GOC DB is also affected and is not available. 
Dear VO Managers and Users,

We inform you that we are beginning the decommissioning process (https://wiki.egi.eu/wiki/PROC12) for the grid services at Resource Centre UPV-GRyCAP in NGI_IBERGRID.

Services affected:

ngiescream.i3m.upv.es (CREAM)
ngiesse.i3m.upv.es (SE)
ngiesmon.i3m.upv.es (gLite-APEL)

Timeline:

25th March 2015:
Due to some internal problems the CREAM node is now in scheduled downtime and the submission of jobs has been disabled.
File writing on the SE node is disabled.

15th April 2015:
CREAM service will be removed
SE node will be set in scheduled downtime
gLite-APEL node will be set in scheduled downtime

20th April 2015:
gLite-APEL service will be removed

25th May 2015:
SE service will be removed
Dear Users,

The Operations Portal 3.1.2 is now online.
The main features are:

1) the possibility to set a Master alarm for a ticket: when you create a new ROD/MW ticket, select one of the alarms to set the ticket subject from the details of that alarm

2) the possibility to use a quick mode
       - for Tickets (Create, Update, Escalation, Close)
       - for Notepads (Create, Update, Close)
       - for Alarm closure

The forms are now pre-filled and simplified. You access them through a pop-up window, and once you have submitted a form you are returned to the main page.
You can use the verbose mode to see more options.

3) an additional Carbon Copy: you can create a ticket with an additional CC (available in verbose mode)

Here is the complete list of all features and fixes:
http://operations-portal.egi.eu/home/tasksList/release_id/10

Don't hesitate to contact us with comments, feedback or bug reports at cic-information@in2p3.fr

Thanks in advance


Cheers,
Hi,

One of the services run by the EGI central accounting repository failed to restart correctly after a dropped connection. The connection to receive data from the EGI message broker was dropped on Friday evening (2015-03-13 18:03 UTC). The accounting data was buffered on the message brokers but was not downloaded to the accounting repository until Monday morning (2015-03-16 09:00 UTC).

Once the backlog of data has been loaded, it will be summarised, then the accounting portal will be updated. We expect this to take until the evening on Monday.

This problem affects grid accounting data only. The cloud accounting service recovered correctly and continued to run. We will be investigating how we can prevent this from occurring in the future.

I apologise for the inconvenience.
Stuart Pullinger
Accounting Team Leader