Currently, we provide support to about 550 users involved in more than 100 research projects led by the communities of INFN and the University of Padua. The trend indicates that the number of users and projects grows over time.
Resource usage in the last month
To give an indication of how CloudVeneto is used, brief descriptions of some of the projects follow.
Cherenkov telescope images of cosmic and gamma ray showers are ideal for AI analysis to classify events, estimate gamma ray energy, and determine direction. Deep learning, especially convolutional neural networks, is being explored for its ability to detect rare events, outperforming traditional methods in identifying challenging cases like multiple gamma rays or heavy nuclei.
Dr. Rubèn Lopez Coto
The CMS experiment at CERN is searching for new particles like the Higgs boson and measuring particle-antiparticle asymmetries. The Padua CMS group is involved in these studies, analyzing large data sets from the detector. CloudVeneto's computing resources are used for data reconstruction, simulations, statistical analysis, and machine learning models to distinguish signals from background processes.
Dr. Jacopo Pazzini
This project, in collaboration with the E. Mach Institute, analyzed sequence similarity in a database of 300,000 plant protein sequences from commercially important plants like apple, strawberry, and coffee. Self-alignment studies were conducted to cluster biologically significant sequences. CloudVeneto's platform was used to parallelize the task, reducing computation time.
Dr. Ivan Mičetić
The project on the CloudVeneto platform simulates the recognition process between chemical compounds and target proteins using a molecular docking approach. Our laboratory (MMS) has archived a public chemical library of around 5 million compounds for drug candidate screening (MMsINC). Each molecular docking simulation generates approximately 5 plausible ligand-protein complexes per compound for each target protein, resulting in around 25 million complexes per study.
Prof. Stefano Moro
Antisymmetrized Molecular Dynamics (AMD) is a code used to simulate nuclear reaction dynamics, incorporating particle structures and correlations. The NUCLEX collaboration utilizes the CloudVeneto infrastructure, running AMD in a virtual machine cluster with parallel processing via OpenMPI. This setup reduces computation time from several months to just 5-7 days for 50,000 events.
Dr. Tommaso Marchi, Dr. Magda Cicerchia
The Padua group, involved in building components for the Large-Sized Telescopes of the CTA project, is refining the data analysis software. Using CloudVeneto resources, tools for event reconstruction are being developed, simulating how the telescope observes atmospheric showers and defining methods to determine the direction and energy of gamma rays. CloudVeneto will be used to analyze the first scientific data from the telescope inaugurated in La Palma.
Dr. Rubèn Lopez Coto
The CMS experiment at CERN produces tens of PBs of data annually. The Padua CMS group aims to redefine high-energy physics computing by integrating modern Big Data technologies. Additionally, innovative real-time data acquisition techniques using fast data streaming systems based on Apache Kafka are being developed. CloudVeneto provides dedicated clusters for these activities, enhancing data analysis and acquisition efficiency.
Dr. Jacopo Pazzini
The QuantumFuture group at DEI has utilized CloudVeneto for a project involving a quantum random number generator with data rates of tens of Gbps. On CloudVeneto, post-processing and analysis of data from the physical generator were performed. We appreciated having total control over the resources and the ability to dynamically manage the number of machines and resources.
Dr. Marco Avesani
The Computational Biology Lab of the Department of Biomedical Sciences maintains a database of structural annotations for disordered regions in protein sequences, originally covering 80 million sequences. An update is underway to recalculate annotations for over 130 million sequences. CloudVeneto's platform is used to parallelize the annotation pipeline, reducing computation time and speeding up the release cycle.
Dr. Ivan Mičetić
Some information about current VCPU usage
The images of atmospheric showers produced by cosmic rays and gamma rays captured by Cherenkov telescopes are well-suited for analysis by artificial intelligence for the purpose of classifying events, estimating the energy of gamma rays, and determining their direction.
Several deep learning methodologies are under study, particularly convolutional neural networks trained on simulated samples, whose performance will be compared with more traditional analysis methods. The advantage of the “deep learning” approach is the ability to conduct searches for rare and specific events such as multiple gamma rays (bosonic condensates) or images of showers produced by “heavy nuclei” in cosmic rays, which are difficult to classify with analytical methods.
Dr. Rubèn Lopez Coto
The CMS (Compact Muon Solenoid) experiment focuses on detecting particles produced in proton collisions generated by the Large Hadron Collider (LHC) at CERN, with the aim of conducting an extensive campaign of measurements. Notable examples include searches for ‘new’ particles, such as the one that led to the discovery of the Higgs boson, and precision measurements of fundamental properties of nature, such as particle-antiparticle asymmetries.
The CMS group at the Department of Physics and Astronomy of Padua and the INFN section of Padua is actively involved in many of these studies, based on the analysis of vast amounts of data collected by the detector’s sensors.
For the reconstruction and subsequent analysis of this data, the use of computing tools capable of tackling computationally intensive tasks is necessary. These tasks include simulating numerous interactions between particles and the detector’s response, reconstructing events with highly complex topologies, performing statistical analysis for estimating confidence intervals, or developing classifiers based on neural networks for discriminating possible signals from the expected background from known processes in the detector.
To this end, the resources provided by the CloudVeneto infrastructure are extensively used through dedicated clusters, including clusters with ‘elastic’ resource allocation. These clusters run the experiment’s reconstruction software (CMSSW, based on Python and C++) and development environments for multivariate algorithms, such as TMVA, TensorFlow, and Theano.
Dr. Jacopo Pazzini
The CMS (Compact Muon Solenoid) experiment at CERN studies the outcomes of proton-proton collisions produced by the Large Hadron Collider. Because signals from over 70 million detector readout channels must be analyzed every 25 ns, the experiment produces tens of PB of data annually. Analyzing these large datasets to search for rare and highly elusive signals requires intensive use of extensive computing resources, both for the selection and processing of data collected ‘online’ during acquisition and for the subsequent ‘offline’ data analysis.
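As a rough illustration of the scale quoted above, the following sketch converts the 25 ns interval into a rate and multiplies it by the number of readout channels. It assumes exactly 70 million channels (the text says "over 70 million"), the variable names are illustrative, and the result counts raw channel samples only, before any online selection.

```python
# Back-of-the-envelope estimate using only figures quoted in the text.
# Assumption: exactly 70 million channels (the text says "over 70 million").
NS_PER_SECOND = 1_000_000_000
SAMPLING_INTERVAL_NS = 25
READOUT_CHANNELS = 70_000_000

# One readout every 25 ns corresponds to this many readouts per second.
samples_per_second = NS_PER_SECOND // SAMPLING_INTERVAL_NS

# Every channel is sampled at each readout.
channel_samples_per_second = READOUT_CHANNELS * samples_per_second

print(f"Sampling rate: {samples_per_second / 1e6:.0f} MHz")
print(f"Channel samples per second: {channel_samples_per_second:.1e}")
```

Under these assumptions the readout rate is 40 MHz, or roughly 2.8 × 10^15 channel samples per second, which suggests why data must already be selected during acquisition rather than stored in full.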
The CMS group at the Department of Physics and Astronomy of Padua and the INFN section of Padua collaborate with CERN with the aim of redefining the computing paradigm in high-energy physics analyses through the integration of modern technologies for processing large datasets, commercially known as Big Data. To achieve this goal, software infrastructures (Apache Spark, Apache Mesos, Kubernetes) are employed to optimize the utilization of available computing resources, resulting in a reduction of several orders of magnitude in data processing time. Additionally, the group is developing innovative techniques for real-time (online) acquisition and processing of the enormous amount of signals coming directly from the experiment's sensors. These techniques integrate software systems capable of fast data streaming (based on Apache Kafka) with computing clusters based on Apache Spark.
The resources of the CloudVeneto infrastructure are used in both of these activities through the creation of dedicated Apache Spark/Mesos and Kubernetes clusters.
Dr. Jacopo Pazzini
The project we are carrying out using the CloudVeneto platform involves simulating the recognition process between chemical compounds and target proteins through a molecular docking approach. Currently, in our laboratory (MMS), we have virtually archived a chemical library of approximately 5 million compounds commonly used in screening for identifying new drug candidates.
The archive, called MMsINC, is in the public domain. During the molecular docking simulation, typically 5 plausible ligand-protein complexes are produced per compound for each selected target protein, resulting in a maximum total of approximately 25 million complexes per case study. In our laboratory, this screening campaign is managed by distributing the work across multiple processes.
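The figures above can be checked with a short calculation, together with a minimal sketch of how a library of this size might be split into near-equal batches for distribution. This is an illustration only, not the laboratory's actual workflow: it reads the 5 complexes as per compound, and the function names and the worker count of 64 are hypothetical.

```python
# Illustrative sketch; function names and worker count are hypothetical.

def expected_complexes(n_compounds: int, poses_per_compound: int) -> int:
    """Upper bound on complexes produced for a single target protein."""
    return n_compounds * poses_per_compound


def split_into_chunks(n_items: int, n_workers: int) -> list[range]:
    """Split indices 0..n_items-1 into n_workers contiguous, near-equal ranges."""
    base, extra = divmod(n_items, n_workers)
    chunks = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


LIBRARY_SIZE = 5_000_000   # compounds in the library, from the text
POSES_PER_COMPOUND = 5     # typical complexes per compound, from the text
N_WORKERS = 64             # hypothetical number of parallel processes

total = expected_complexes(LIBRARY_SIZE, POSES_PER_COMPOUND)
chunks = split_into_chunks(LIBRARY_SIZE, N_WORKERS)

print(f"Complexes per case study: {total:,}")
print(f"Compounds per worker: {len(chunks[0]):,}")
```

Each range could then be handed to a separate docking process. Contiguous ranges keep the bookkeeping simple, although a real campaign may prefer smaller batches so that slow compounds do not stall a single worker.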
We are also planning to launch a web service, which we have already named MMSDockCloud, to serve as an access portal to the computation service described above.
Prof. Stefano Moro
CloudVeneto © 2023 – All rights reserved